ak.layout.RegularArray

The RegularArray class describes lists that all have the same length, given by the single integer size. Its underlying content is a flattened view of the data; that is, each list is not stored separately in memory, but is inferred as a subinterval of the underlying data.

If the content length is not an integer multiple of size, then the trailing items of the content are ignored: the length of the RegularArray is the content length divided by size, rounded down.
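As a rough illustration of this truncation rule, here is a standalone sketch that uses plain Python lists rather than the library's classes. The helper name truncated_length is hypothetical and is not part of Awkward Array.

```python
def truncated_length(content_length, size):
    # Hypothetical helper: number of complete lists of `size` items
    # that fit into `content_length` items (size must be nonzero here).
    assert size > 0
    return content_length // size  # floor division drops the remainder


content = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6]   # 7 items
size = 3
length = truncated_length(len(content), size)   # only 2 complete lists fit

# Each list is a subinterval of the flat content; the trailing 6.6 is unused.
lists = [content[i * size:(i + 1) * size] for i in range(length)]
```

With seven items and a size of 3, only the first six items belong to a list.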

An extra field zeros_length is ignored unless the size is zero. In that case only, zeros_length sets the length of the RegularArray, which makes it possible for an array to contain a non-zero number of zero-length lists with regular type.
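The role of zeros_length can be sketched in a few lines of plain Python. This is an illustration only: the helper name regular_length is hypothetical and simply mirrors the rule stated above.

```python
def regular_length(content_length, size, zeros_length=0):
    # Hypothetical helper mirroring the rule in the text.
    if size != 0:
        return content_length // size   # zeros_length is ignored
    return zeros_length                 # only consulted when size == 0


content = []   # zero-length lists need no underlying data
size = 0
length = regular_length(len(content), size, zeros_length=3)

# Three lists, each an empty subinterval of the content.
lists = [content[i * size:(i + 1) * size] for i in range(length)]
```

Without zeros_length, the length could not be recovered from the content when size is zero, because the division would be undefined.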

A multidimensional ak.layout.NumpyArray is equivalent to a one-dimensional ak.layout.NumpyArray nested within several RegularArrays, one for each dimension after the first (the first dimension is the length of the outermost array). However, RegularArrays can also be used to make regular-length lists of any other content type.
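The equivalence can be pictured with plain Python lists. The sketch below is not the library's API: the helper regular is hypothetical and stands in for wrapping a content in one RegularArray of the given size.

```python
def regular(flat, size):
    # Hypothetical helper: view `flat` as consecutive lists of `size` items.
    return [flat[i * size:(i + 1) * size] for i in range(len(flat) // size)]


flat = list(range(12))   # one-dimensional data: 0, 1, ..., 11

# A shape of (2, 3, 2) needs two levels of regular nesting: the innermost
# size (2) is applied first, then the next size outward (3). The leading
# 2 is simply the length of the result.
nested = regular(regular(flat, 2), 3)
```

Here nested has length 2, each of its items has length 3, and each of those has length 2, matching a row-major array of shape (2, 3, 2).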

RegularArray corresponds to an Apache Arrow Tensor.

Below is a simplified implementation of a RegularArray class in pure Python that exhaustively checks validity in its constructor (see ak.is_valid) and can generate random valid arrays. The random_number() function returns a random float and the random_length(minlen) function returns a random int that is at least minlen; where it is called with a second argument below, that argument is presumably an upper bound. The RawArray class represents simple, one-dimensional data.

class RegularArray(Content):
    def __init__(self, content, size, zeros_length=0):
        assert isinstance(content, Content)
        assert isinstance(size, int)
        assert isinstance(zeros_length, int)
        assert size >= 0
        if size != 0:
            length = len(content) // size   # floor division
        else:
            assert zeros_length >= 0
            length = zeros_length
        self.content = content
        self.size = size
        self.length = length

    @staticmethod
    def random(minlen, choices):
        size = random_length(0, 5)
        zeros_length = random_length(0, 5)
        content = random.choice(choices).random(random_length(minlen) * size, choices)
        return RegularArray(content, size, zeros_length)

    def __len__(self):
        return self.length

    def __getitem__(self, where):
        if isinstance(where, int):
            assert 0 <= where < len(self)
            return self.content[(where) * self.size:(where + 1) * self.size]
        elif isinstance(where, slice) and where.step is None:
            start = where.start * self.size
            stop = where.stop * self.size
            zeros_length = where.stop - where.start
            return RegularArray(self.content[start:stop], self.size, zeros_length)
        elif isinstance(where, str):
            return RegularArray(self.content[where], self.size, self.length)
        else:
            raise AssertionError(where)

    def __repr__(self):
        if self.size == 0:
            zeros_length = ", " + repr(self.length)
        else:
            zeros_length = ""
        return "RegularArray(" + repr(self.content) + ", " + repr(self.size) + zeros_length + ")"

    def xml(self, indent="", pre="", post=""):
        out = indent + pre + "<RegularArray>\n"
        out += self.content.xml(indent + "    ", "<content>", "</content>\n")
        out += indent + "    <size>" + str(self.size) + "</size>\n"
        if self.size == 0:
            out += indent + "    <zeros_length>" + str(self.length) + "</zeros_length>\n"
        out += indent + "</RegularArray>" + post
        return out

Here is an example:

RegularArray(RawArray([7.4, -0.0, 6.6, 6.6, 5.2, 4.6, 9.6, 4.2, 2.3, 6.5, 4.2, 1.3, 2.2, 4.1,
                       1.9, 3.9, 2.3, 2.3, 0.7, 6.9, 1.4, 9.6, 11.8, 6.8, 8.2, 10.5, 8.2,
                       7.5, 6.3, 5.4, 0.5, 1.0, 5.5, 4.1, 5.9, 7.9, 6.7, 7.3, 5.6, 5.5, 2.2,
                       2.2, -0.3, 3.5, 11.2, 13.4, 6.7, -1.0, 6.4, 1.3, 6.8, 5.1, 3.2, 9.5,
                       2.8]),
             5)
<RegularArray>
    <content><RawArray>
        <ptr>7.4 -0.0 6.6 6.6 5.2 4.6 9.6 4.2 2.3 6.5 4.2 1.3 2.2 4.1 1.9 3.9 2.3 2.3 0.7 6.9
             1.4 9.6 11.8 6.8 8.2 10.5 8.2 7.5 6.3 5.4 0.5 1.0 5.5 4.1 5.9 7.9 6.7 7.3 5.6
             5.5 2.2 2.2 -0.3 3.5 11.2 13.4 6.7 -1.0 6.4 1.3 6.8 5.1 3.2 9.5 2.8</ptr>
    </RawArray></content>
    <size>5</size>
</RegularArray>

which represents the following logical data.

[[7.4, -0.0, 6.6, 6.6, 5.2],
 [4.6, 9.6, 4.2, 2.3, 6.5],
 [4.2, 1.3, 2.2, 4.1, 1.9],
 [3.9, 2.3, 2.3, 0.7, 6.9],
 [1.4, 9.6, 11.8, 6.8, 8.2],
 [10.5, 8.2, 7.5, 6.3, 5.4],
 [0.5, 1.0, 5.5, 4.1, 5.9],
 [7.9, 6.7, 7.3, 5.6, 5.5],
 [2.2, 2.2, -0.3, 3.5, 11.2],
 [13.4, 6.7, -1.0, 6.4, 1.3],
 [6.8, 5.1, 3.2, 9.5, 2.8]]

In addition to the properties and methods described in ak.layout.Content, a RegularArray has the following.

ak.layout.RegularArray.__init__

ak.layout.RegularArray.__init__(content, size, identities=None, parameters=None)

ak.layout.RegularArray.content

ak.layout.RegularArray.content

ak.layout.RegularArray.size

ak.layout.RegularArray.size

ak.layout.RegularArray.compact_offsets64

ak.layout.RegularArray.compact_offsets64(start_at_zero=True)

Returns a 64-bit ak.layout.Index of offsets by prefix summing in steps of size.
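For intuition, the offsets of a regular array can be computed as follows. This is a pure-Python sketch with a hypothetical function name; the real method returns an ak.layout.Index rather than a list, and the start_at_zero parameter is not modeled here.

```python
def compact_offsets(length, size):
    # Hypothetical analogue: offsets[i] is where list i starts in the
    # content, and offsets[length] is where the last list ends.
    return [i * size for i in range(length + 1)]


offsets = compact_offsets(4, 3)   # 4 lists of 3 items each

# Consecutive offsets always differ by exactly `size`.
counts = [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]
```

The result has one more entry than the number of lists, as in a ListOffsetArray.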

ak.layout.RegularArray.broadcast_tooffsets64

ak.layout.RegularArray.broadcast_tooffsets64(offsets)

Shifts contents to match a given set of offsets (if possible) and returns an ak.layout.ListOffsetArray with the results. This is used in broadcasting because a set of ak.types.ListType and ak.types.RegularType arrays must be reordered to share common offsets before they can be directly operated upon.
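The following pure-Python sketch shows one plausible reading of this operation. It is not the library's implementation. The function name and the two cases are assumptions made for illustration: lists of size 1 are stretched to the requested counts, and any other size must already agree with the offsets.

```python
def broadcast_to_offsets(content, size, offsets):
    # Hypothetical helper (an assumption, not the library's code).
    # Returns (offsets, new_content), the two parts of a list-offset layout.
    assert size > 0 and offsets[0] == 0
    length = len(content) // size
    assert len(offsets) - 1 == length
    counts = [offsets[i + 1] - offsets[i] for i in range(length)]
    if size == 1:
        # Each length-1 list is repeated to fill the requested count.
        new_content = []
        for item, count in zip(content, counts):
            new_content.extend([item] * count)
        return offsets, new_content
    # Otherwise the offsets must already describe lists of exactly `size`.
    if any(count != size for count in counts):
        raise ValueError("cannot broadcast lists of size %d" % size)
    return offsets, content


# Three lists of size 1 stretched to counts 2, 0, and 3.
new_offsets, new_content = broadcast_to_offsets([10, 20, 30], 1, [0, 2, 2, 5])
```

In this sketch the value 20 disappears from the content because its target list is empty.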

ak.layout.RegularArray.simplify

ak.layout.RegularArray.simplify()

Pass-through; returns the original array.