The IndexedArray concept is implemented in three specialized classes:

  • ak.layout.IndexedArray64: index values are 64-bit signed integers.

  • ak.layout.IndexedArray32: index values are 32-bit signed integers.

  • ak.layout.IndexedArrayU32: index values are 32-bit unsigned integers.

The IndexedArray class is a general-purpose tool for changing the order of and/or duplicating some content. Its index array is a lazily applied np.take (integer-array slice, also known as “advanced indexing”).
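
As a rough illustration of what "lazily applied" means here, the following pure-Python sketch keeps an index next to its content and performs the lookup only when the values are requested. The helper name take is hypothetical and not part of Awkward Array; with NumPy arrays, np.take(content, index) or content[index] would select the same values.

```python
def take(content, index):
    """Eagerly apply an integer index to a list (hypothetical helper)."""
    return [content[i] for i in index]

content = [10.0, 20.0, 30.0, 40.0]
index = [2, 2, 0, 3]          # reorders and duplicates; element 1 is never used

# An IndexedArray-like pair keeps (index, content) as-is; nothing is copied
# until the values are actually needed.
lazy_pair = (index, content)

# Materializing applies the index to the content.
materialized = take(*reversed(lazy_pair))
print(materialized)           # [30.0, 30.0, 10.0, 40.0]
```

Note that the content may contain elements that no index entry refers to, and that one element may be referenced several times.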

It has many uses:

  • representing a lazily applied slice.

  • simulating pointers into another collection.

  • emulating the dictionary encoding of Apache Arrow and Parquet.
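
The dictionary-encoding use case can be sketched in pure Python: the distinct values are stored once, and an integer index records which value appears at each position. The function dictionary_encode is a hypothetical helper written for this illustration and is not an Awkward Array, Arrow, or Parquet API.

```python
def dictionary_encode(values):
    """Return (index, dictionary) such that dictionary[index[i]] == values[i].

    Hypothetical helper for illustration only.
    """
    dictionary = []           # distinct values, in order of first appearance
    position = {}             # value -> its position in `dictionary`
    index = []
    for v in values:
        if v not in position:
            position[v] = len(dictionary)
            dictionary.append(v)
        index.append(position[v])
    return index, dictionary

values = ["red", "green", "red", "blue", "green", "red"]
index, dictionary = dictionary_encode(values)
print(index)        # [0, 1, 0, 2, 1, 0]
print(dictionary)   # ['red', 'green', 'blue']

# Applying the index to the dictionary recovers the original sequence.
decoded = [dictionary[i] for i in index]
```

Here the index plays the role of an IndexedArray's index and the dictionary plays the role of its content.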

IndexedArray doesn’t have a direct equivalent in Apache Arrow.

Below is a simplified implementation of an IndexedArray class in pure Python that exhaustively checks validity in its constructor (see ak.is_valid) and can generate random valid arrays. The random_number() function returns a random float and the random_length(minlen) function returns a random int that is at least minlen. The RawArray class represents simple, one-dimensional data.

class IndexedArray(Content):
    def __init__(self, index, content):
        assert isinstance(index, list)
        assert isinstance(content, Content)
        for x in index:
            assert isinstance(x, int)
            assert 0 <= x < len(content)   # index[i] must be non-negative and within the content
        self.index = index
        self.content = content

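    @staticmethod
    def random(minlen, choices):
        # A non-empty index needs at least one content element to point at.
        if minlen == 0:
            content = random.choice(choices).random(0, choices)
        else:
            content = random.choice(choices).random(1, choices)
        # With empty content, no index value would be valid, so the index is empty too.
        if len(content) == 0:
            index = []
        else:
            index = [random.randint(0, len(content) - 1)
                         for i in range(random_length(minlen))]
        return IndexedArray(index, content)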

    def __len__(self):
        return len(self.index)

    def __getitem__(self, where):
        if isinstance(where, int):
            assert 0 <= where < len(self)
            return self.content[self.index[where]]
        elif isinstance(where, slice) and where.step is None:
            return IndexedArray(self.index[where.start:where.stop], self.content)
        elif isinstance(where, str):
            return IndexedArray(self.index, self.content[where])
        else:
            raise AssertionError(where)

    def __repr__(self):
        return "IndexedArray(" + repr(self.index) + ", " + repr(self.content) + ")"

    def xml(self, indent="", pre="", post=""):
        out = indent + pre + "<IndexedArray>\n"
        out += indent + "    <index>" + " ".join(str(x) for x in self.index) + "</index>\n"
        out += self.content.xml(indent + "    ", "<content>", "</content>\n")
        out += indent + "</IndexedArray>" + post
        return out

Here is an example:

IndexedArray([3, 5, 1, 1, 5, 3],
             RawArray([8.9, 3.2, 5.4, 9.8, 7.5, 1.9]))

<IndexedArray>
    <index>3 5 1 1 5 3</index>
    <content><RawArray>
        <ptr>8.9 3.2 5.4 9.8 7.5 1.9</ptr>
    </RawArray></content>
</IndexedArray>

which represents the following logical data.

[9.8, 1.9, 3.2, 3.2, 1.9, 9.8]
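
The logical data above can be checked by hand with plain lists. The sketch below uses ordinary Python lists as stand-ins for the index and the RawArray content (an assumption made to keep it self-contained), and also shows that slicing only needs to touch the index.

```python
index = [3, 5, 1, 1, 5, 3]
content = [8.9, 3.2, 5.4, 9.8, 7.5, 1.9]

# Each logical element i is content[index[i]].
logical = [content[i] for i in index]
print(logical)                      # [9.8, 1.9, 3.2, 3.2, 1.9, 9.8]

# Slicing the array slices only the index; the content is shared unchanged.
sliced_index = index[1:4]           # [5, 1, 1]
sliced = [content[i] for i in sliced_index]
print(sliced)                       # [1.9, 3.2, 3.2]
```

Slicing the index and then applying it gives the same values as applying the index and then slicing the result.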

In addition to the properties and methods described in ak.layout.Content, an IndexedArray has the following.


ak.layout.IndexedArray.__init__(index, content, identities=None, parameters=None)


ak.layout.IndexedArray.isoption

Returns False because this is not an IndexedOptionArray.

ak.layout.IndexedArray.project(mask=None)

Returns an array with the index applied to reorder/duplicate elements.

If mask is a signed 8-bit ak.layout.Index in which 0 means valid and 1 means missing, only valid elements according to this mask are returned.
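
The masking rule above can be sketched with plain lists. The function project_sketch is a hypothetical stand-in, not the library's implementation, and it assumes that the mask has the same length as the index and is matched to it position by position.

```python
def project_sketch(index, content, mask=None):
    """Apply `index` to `content`, optionally dropping masked positions.

    Hypothetical illustration: mask[i] == 0 means position i is valid,
    mask[i] == 1 means it is missing. Assumes len(mask) == len(index).
    """
    if mask is None:
        return [content[i] for i in index]
    assert len(mask) == len(index)
    return [content[i] for i, m in zip(index, mask) if m == 0]

index = [3, 5, 1, 1, 5, 3]
content = [8.9, 3.2, 5.4, 9.8, 7.5, 1.9]

print(project_sketch(index, content))
# [9.8, 1.9, 3.2, 3.2, 1.9, 9.8]

print(project_sketch(index, content, mask=[0, 1, 0, 0, 1, 0]))
# [9.8, 3.2, 3.2, 9.8]
```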

ak.layout.IndexedArray.bytemask()

Returns an 8-bit signed ak.layout.Index of all zeros, because this IndexedArray does not have ak.types.OptionType.

ak.layout.IndexedArray.simplify()

Combines this node with its content if the content also has ak.types.OptionType or is an ak.layout.IndexedArray; otherwise, this is a pass-through. In all cases, the output has the same logical meaning as the input.

This method only operates one level deep.
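
For the case in which the content is itself an IndexedArray, combining the two nodes amounts to composing the two indexes. The following is a minimal sketch of that idea with plain lists; compose_indexes is a hypothetical name, and the option-type case is not covered.

```python
def compose_indexes(outer_index, inner_index):
    """Collapse two stacked indexes into one (hypothetical helper).

    The outer index selects positions of the inner index, which in turn
    selects positions of the underlying content.
    """
    return [inner_index[i] for i in outer_index]

content = [8.9, 3.2, 5.4, 9.8, 7.5, 1.9]
inner_index = [3, 5, 1, 1, 5, 3]
outer_index = [0, 0, 4, 2]

# Two levels: apply the inner index first, then the outer index.
inner_logical = [content[j] for j in inner_index]
two_level = [inner_logical[i] for i in outer_index]

# One level: a single combined index over the original content.
combined_index = compose_indexes(outer_index, inner_index)
one_level = [content[j] for j in combined_index]

print(combined_index)   # [3, 3, 5, 1]
print(one_level)        # [9.8, 9.8, 1.9, 3.2]
```

Both forms describe the same logical data, but the combined form needs only one lookup per element.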