
aloc usage

Purpose

  • select gracefully: rows or columns that are given as an indexer but do not exist do not raise an error
  • align Series or dios indexers
  • set multiple columns at once with a list-like value

Overview

aloc is called like loc, either with a single key that acts as row indexer (aloc[rowkey]) or with a tuple of row indexer and column indexer (aloc[rowkey, columnkey]). A 2D-indexer (like a dios or a df) can also be given, but only as a single key, like .aloc[2D-indexer], or together with the special column key ..., the ellipsis (.aloc[2D-indexer, ...]). The ellipsis may change how the 2D-indexer is interpreted; this is explained in detail later.

If a normal (non-2D) row indexer is given without a column indexer, the latter defaults to :, a.k.a. slice(None), so .aloc[row-indexer] becomes .aloc[row-indexer, :], which means that all columns are used. In general, a normal row indexer is applied to every column chosen by the column indexer, but to each column separately.

A first example may give a rough idea:

>>> d
    a |     b |     c |     d | 
===== | ===== | ===== | ===== | 
0  66 | 2  77 | 0  88 | 1  99 | 
1  66 | 3  77 | 1  88 | 2  99 | 


>>> d.aloc[[1,2], ['a', 'b', 'd', 'x']]
    a |     b |     d | 
===== | ===== | ===== | 
1  66 | 2  77 | 1  99 | 
      |       | 2  99 | 

The return type

Unlike the other two indexer methods, loc and iloc, aloc never returns a single item. The return type is a pandas.Series if the column indexer is a single key (e.g. 'a'), and a dios otherwise. The row indexer plays no role in the choice of the return type.

Note for the curious:

This is because a scalar row key (.aloc[key]) is translated to .loc[key:key] under the hood.
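
The effect of the slice translation can be illustrated with plain pandas. The following is only a sketch on a single column with a sorted integer index (an assumption made for the example); it shows why a key:key slice never yields a scalar and why a missing label does not raise.

```python
import pandas as pd

col = pd.Series([0, 7, 14, 21, 28], index=[0, 1, 2, 3, 4], name="a")

# A scalar key with .loc returns the item itself ...
item = col.loc[1]

# ... while the slice key:key returns a Series, even for a single hit.
hit = col.loc[1:1]

# On a sorted index, a label that does not exist gives an empty Series
# instead of raising a KeyError.
miss = col.loc[99:99]

print(item, len(hit), len(miss))
```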

Indexer types

The .aloc-specific indexers are listed below. Any indexer that is not listed (slice, boolean lists, ...) but is known to work with .loc is treated as if it were passed to .loc, which is what actually happens under the hood.

Some indexers are linked to later sections, where a more detailed explanation and examples are given.

Special column indexers are:

  • list / array-like (or any iterable object): Only labels that are present in the columns are used, others are ignored.
  • pd.Series: The .values are taken from the series and handled like a list.
  • scalar (or any hashable object): Selects a single column if the label is present, otherwise nothing.

Special row indexers are:

  • list / array-like (or any iterable object): Only rows whose indices are present in the index of the column are used, others are ignored. A dios is returned.
  • scalar (or any hashable object): Selects a single row from a column if the value is present in the index of the column, otherwise nothing is selected. [1]
  • pd.Series: Aligns the index of the given series with the column, which means that only common indices are used. The actual values of the series are ignored(!).
  • boolean pd.Series: Like pd.Series, but only True values are evaluated; False values are equivalent to missing indices. To treat a boolean series as a normal indexer series, as described above, one can use .aloc(usebool=False)[boolean pd.Series].

Special 2D-indexers are:

  • .aloc[boolean dios-like]: Works the same as di[boolean dios-like] (see there). In brief: full align, then select the items where the index is present and the value is True.
  • .aloc[dios-like, ...] (with Ellipsis): Aligns in columns and rows and ignores the values. Per common column, the common indices are selected. The ellipsis forces aloc to ignore the values, so a boolean dios can be treated as a non-boolean one. Alternatively, .aloc(usebool=False)[boolean dios-like] can be used. [2]
  • .aloc[nested list-like]: The inner lists are used as aloc list row indexers (see there) on all columns, one list per column, which implies that the outer list must have the same length as the number of columns.

Special handling of 1D-values

Values that are list-like or array-like, which includes pd.Series, are set on all selected columns. A pd.Series is aligned as in s1.loc[:] = s2. See also the cookbook.
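
The alignment referred to by s1.loc[:] = s2 can be seen with two plain pandas Series. This is a small sketch with made-up values; float data is used so that missing partners can be shown as NaN without a dtype change.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=[0, 1, 2])
s2 = pd.Series([10.0, 20.0], index=[1, 5])

# The right-hand side is aligned to the index of s1 before it is written:
# label 1 is common, label 5 is dropped, labels 0 and 2 have no partner
# in s2 and end up as NaN.
s1.loc[:] = s2
print(s1)
```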

Indexer Table

example                   | type               | on      | like .loc | handling                       | conditions / hints                                 | link
========================= | ================== | ======= | ========= | ============================== | ================================================== | =====
Column indexer
.aloc[any, 'a']           | scalar             | columns | no        | select graceful                | -                                                  | cols
.aloc[any, 'b':'z']       | slice              | columns | yes       | slice                          | -                                                  | cols
.aloc[any, ['a','c']]     | list-like          | columns | no        | filter graceful                | -                                                  | cols
.aloc[any, [True,False]]  | bool list-like     | columns | yes       | take True's                    | length must match nr of columns                    | cols
.aloc[any, s]             | pandas.Series      | columns | no        | like list                      | only s.values are evaluated                        | cols
.aloc[any, bs]            | bool pandas.Series | columns | yes       | like bool-list                 | see there                                          | cols
Row indexer
.aloc[7, any]             | scalar             | rows    | no        | translate to .loc[key:key]     | -                                                  | rows
.aloc[3:42, any]          | slice              | rows    | yes       | slice                          | -                                                  |
.aloc[[1,2,24], any]      | list-like          | rows    | no        | filter graceful                | -                                                  | rows
.aloc[[True,False], any]  | bool list-like     | rows    | yes       | take True's                    | length must match length of (all selected) columns | blist
.aloc[s, any]             | pandas.Series      | rows    | no        | like .loc[s.index]             | -                                                  | ser
.aloc[bs, any]            | bool pandas.Series | rows    | no        | align + just take True's       | evaluate usebool-keyword                           | ser
.aloc[[[s],[1,2,3]], any] | nested list-like   | both    | ?         | one row-indexer per column     | outer length must match nr of (selected) columns   | nlist
2D-indexer
.aloc[di]                 | dios-like          | both    | no        | full align                     | -                                                  |
.aloc[di, ...]            | dios-like          | both    | no        | full align                     | ellipsis has no effect                             |
.aloc[di>5]               | bool dios-like     | both    | no        | full align + take True's       | evaluate usebool-keyword                           |
.aloc[di>5, ...]          | (bool) dios-like   | both    | no        | full align, no bool evaluation | -                                                  |

Example dios

The dios used in the examples, unless stated otherwise:

>>> d
    a |    b |     c |     d | 
===== | ==== | ===== | ===== | 
0   0 | 2  5 | 4   7 | 6   0 | 
1   7 | 3  6 | 5  17 | 7   1 | 
2  14 | 4  7 | 6  27 | 8   2 | 
3  21 | 5  8 | 7  37 | 9   3 | 
4  28 | 6  9 | 8  47 | 10  4 | 

Select columns, gracefully

Single columns

Use .aloc[:, key] to select a single column gracefully. The underlying pandas Series is returned if the key exists. Otherwise an empty pd.Series with dtype=object is returned.

>>> d.aloc[:, 'a']
0     0
1     7
2    14
3    21
4    28
Name: a, dtype: int64

>>> d.aloc[:, 'x']
Series([], dtype: object)

Multiple columns

This works just like selecting a single column gracefully, but with an array-like indexer. A dios with a subset of the existing columns is returned. If no key is present, an empty dios is returned.

>>> d.aloc[:, ['c', 99, None, 'a', 'x', 'y']]
    a |     c | 
===== | ===== | 
0   0 | 4   7 | 
1   7 | 5  17 | 
2  14 | 6  27 | 
3  21 | 7  37 | 
4  28 | 8  47 | 

>>> d.aloc[:, ['x', 'y']]
Empty DictOfSeries
Columns: []

>>> s = pd.Series(dict(a='a', b='x', c='c', foo='d'))
>>> d.aloc[:, s]
    a |     c |     d | 
===== | ===== | ===== | 
0   0 | 4   7 | 6   0 | 
1   7 | 5  17 | 7   1 | 
2  14 | 6  27 | 8   2 | 
3  21 | 7  37 | 9   3 | 
4  28 | 8  47 | 10  4 | 
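
The graceful column filter shown above can be approximated with a plain dict of pandas Series. This is only a sketch, not the dios implementation: graceful_columns is a hypothetical helper, and keeping the original column order (rather than the order of the keys) is an assumption read off the example output.

```python
import pandas as pd

# Stand-in for the example dios: one Series per column, each with its own index.
d = {
    "a": pd.Series([0, 7, 14, 21, 28], index=range(0, 5)),
    "b": pd.Series([5, 6, 7, 8, 9], index=range(2, 7)),
    "c": pd.Series([7, 17, 27, 37, 47], index=range(4, 9)),
    "d": pd.Series([0, 1, 2, 3, 4], index=range(6, 11)),
}

def graceful_columns(columns, keys):
    """Hypothetical helper: keep the columns whose label occurs in keys."""
    if isinstance(keys, pd.Series):
        keys = keys.values  # only the values matter, the index is ignored
    wanted = set(keys)
    return {name: col for name, col in columns.items() if name in wanted}

by_list = graceful_columns(d, ["c", 99, None, "a", "x", "y"])
by_series = graceful_columns(d, pd.Series(dict(a="a", b="x", c="c", foo="d")))
nothing = graceful_columns(d, ["x", "y"])

print(list(by_list), list(by_series), list(nothing))
```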

Boolean indexing, indexing with pd.Series and slice indexer

A boolean indexer, for example [True, False, True, False], must have the same length as the number of columns. Only the columns where the indexer holds a True value are selected.

If the key is a pandas.Series, its values are used for indexing; notably, the index of the series is ignored. If the series holds boolean values, it is treated like a boolean indexer, otherwise it is treated as an array-like indexer.

An easy way to select all columns is to use null slices, like .aloc[:,:] or, even simpler, .aloc[:]. This is just what one would do with loc or iloc. Of course, slicing with boundaries also works, e.g. .aloc[:, 'a':'f'].

See also

Selecting Rows a smart way

For scalar and array-like indexers with label values, the keys are handled gracefully, just like with array-like column indexers.

>>> d.aloc[1]
   a |       b |       c |       d | 
==== | ======= | ======= | ======= | 
1  7 | no data | no data | no data | 

>>> d.aloc[99]
Empty DictOfSeries
Columns: ['a', 'b', 'c', 'd']

>>> d.aloc[[3,6,7,18]]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
3  21 | 3  6 | 6  27 | 6  0 | 
      | 6  9 | 7  37 | 7  1 | 

The length of columns can differ:

>>> d.aloc[[3,6,7,18]].aloc[[3,6]]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
3  21 | 3  6 | 6  27 | 6  0 | 
      | 6  9 |       |      | 
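
The per-column behavior of a list row indexer can be sketched with plain pandas. graceful_rows is a hypothetical helper over a dict of Series, not part of dios; it filters each column by index membership, so the order and duplicates of the keys are ignored in this sketch.

```python
import pandas as pd

d = {
    "a": pd.Series([0, 7, 14, 21, 28], index=range(0, 5)),
    "b": pd.Series([5, 6, 7, 8, 9], index=range(2, 7)),
    "c": pd.Series([7, 17, 27, 37, 47], index=range(4, 9)),
    "d": pd.Series([0, 1, 2, 3, 4], index=range(6, 11)),
}

def graceful_rows(columns, keys):
    """Hypothetical helper: per column, keep the rows whose label is in keys."""
    return {name: col[col.index.isin(keys)] for name, col in columns.items()}

first = graceful_rows(d, [3, 6, 7, 18])   # label 18 exists nowhere and is ignored
second = graceful_rows(first, [3, 6])     # chained selection on the result

print({name: col.to_dict() for name, col in first.items()})
print({name: len(col) for name, col in second.items()})
```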

Boolean array-likes as row indexer

For array-like indexers that hold boolean values, the length of the indexer must match the length of every column to index.

>>> d.aloc[[True,False,False,True,False]]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
0   0 | 2  5 | 4   7 | 6  0 | 
3  21 | 5  8 | 7  37 | 9  3 | 

If the length does not match, an IndexError is raised:

>>> d.aloc[[True,False,False]]
Traceback (most recent call last):
  ...
  f"Boolean index has wrong length: "
IndexError: failed for column a: Boolean index has wrong length: 3 instead of 5

This can be tricky, especially if columns have different lengths:

>>> difflen
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
0   0 | 2  5 | 4   7 | 6  0 | 
1   7 | 3  6 | 6  27 | 7  1 | 
2  14 | 4  7 |       | 8  2 | 

>>> difflen.aloc[[False,True,False]]
Traceback (most recent call last):
  ...
  f"Boolean index has wrong length: "
IndexError: Boolean index has wrong length: 3 instead of 2
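
Why one short column is enough to make the whole selection fail can be sketched as follows. bool_rows is a hypothetical helper that performs the length check explicitly on a dict of Series; the error text merely imitates the messages shown above.

```python
import numpy as np
import pandas as pd

difflen = {
    "a": pd.Series([0, 7, 14], index=[0, 1, 2]),
    "b": pd.Series([5, 6, 7], index=[2, 3, 4]),
    "c": pd.Series([7, 27], index=[4, 6]),
    "d": pd.Series([0, 1, 2], index=[6, 7, 8]),
}

def bool_rows(columns, mask):
    """Hypothetical helper: apply one boolean mask to every column."""
    mask = np.asarray(mask, dtype=bool)
    out = {}
    for name, col in columns.items():
        if len(mask) != len(col):
            raise IndexError(
                f"failed for column {name}: Boolean index has wrong length: "
                f"{len(mask)} instead of {len(col)}"
            )
        out[name] = col.loc[mask]
    return out

try:
    bool_rows(difflen, [False, True, False])
    message = None
except IndexError as err:
    message = str(err)
print(message)

# Restricting the selection to columns of equal length makes the mask usable.
same_len = {name: col for name, col in difflen.items() if name != "c"}
picked = bool_rows(same_len, [False, True, False])
print({name: col.to_dict() for name, col in picked.items()})
```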

pandas.Series and boolean pandas.Series as row indexer

When a pandas.Series is used as row indexer with aloc, all its magic comes to light. The index of the given series is aligned with the index of each column separately and is used as a filter this way.

>>> s = d['b'] + 100
>>> s
2    105
3    106
4    107
5    108
6    109
Name: b, dtype: int64

>>> d.aloc[s]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
2  14 | 2  5 | 4   7 | 6  0 | 
3  21 | 3  6 | 5  17 |      | 
4  28 | 4  7 | 6  27 |      | 
      | 5  8 |       |      | 
      | 6  9 |       |      | 

As seen in the example above, the values of the series are ignored completely. The functionality is similar to s1.loc[s2.index], where s1 and s2 are pandas.Series, s2 is the indexer, and s1 is one column after the other.

If the indexer series holds boolean values, these are not ignored. The series is aligned in the same way as explained above, but additionally only the True values are evaluated. Thus, False values are treated like missing indices. The behavior is analogous to s1.loc[s2[s2].index].

>>> boolseries = d['b'] > 6
>>> boolseries
2    False
3    False
4     True
5     True
6     True
Name: b, dtype: bool

>>> d.aloc[boolseries]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
4  28 | 4  7 | 4   7 | 6  0 | 
      | 5  8 | 5  17 |      | 
      | 6  9 | 6  27 |      | 
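
Both flavors of the Series row indexer can be imitated with plain pandas. series_rows is a hypothetical helper over a dict of Series; its usebool parameter only mimics the keyword of the same name described above, and the index intersection stands in for the graceful alignment.

```python
import pandas as pd

d = {
    "a": pd.Series([0, 7, 14, 21, 28], index=range(0, 5)),
    "b": pd.Series([5, 6, 7, 8, 9], index=range(2, 7)),
    "c": pd.Series([7, 17, 27, 37, 47], index=range(4, 9)),
    "d": pd.Series([0, 1, 2, 3, 4], index=range(6, 11)),
}

def series_rows(columns, key, usebool=True):
    """Hypothetical helper: filter every column by the index of key."""
    if usebool and pd.api.types.is_bool_dtype(key):
        key = key[key]  # False entries are treated like missing indices
    return {
        name: col.loc[col.index.intersection(key.index)]
        for name, col in columns.items()
    }

boolseries = d["b"] > 6
with_bool = series_rows(d, boolseries)                   # only True labels: 4, 5, 6
without_bool = series_rows(d, boolseries, usebool=False)  # all labels: 2..6

print({name: col.index.tolist() for name, col in with_bool.items()})
print({name: col.index.tolist() for name, col in without_bool.items()})
```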

Evaluating boolean values is a very handy feature, as it can easily be used with multiple conditions and also fits nicely into one-liners:

>>> d.aloc[d['b'] > 6]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
4  28 | 4  7 | 4   7 | 6  0 | 
      | 5  8 | 5  17 |      | 
      | 6  9 | 6  27 |      | 

>>> d.aloc[(d['a'] > 6) & (d['b'] > 6)]
    a |    b |    c |       d | 
===== | ==== | ==== | ======= | 
4  28 | 4  7 | 4  7 | no data | 

Note:

Nevertheless, something like d.aloc[d['a'] > d['b']] does not work, because the comparison fails as long as the two series objects do not have the same index. For such cases, one may want to check out DictOfSeries.index_of().
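
The failing comparison comes from pandas itself and can be reproduced without dios. The sketch below also shows one possible workaround with plain pandas, restricting both series to their common labels first; it is independent of DictOfSeries.index_of(), whose usage is not shown here.

```python
import pandas as pd

a = pd.Series([0, 7, 14, 21, 28], index=range(0, 5), name="a")
b = pd.Series([5, 6, 7, 8, 9], index=range(2, 7), name="b")

# Comparing two Series with different indexes raises a ValueError in pandas.
try:
    a > b
    raised = False
except ValueError:
    raised = True

# Workaround: compare only on the labels that both series share.
common = a.index.intersection(b.index)
mask = a.loc[common] > b.loc[common]

print(raised)
print(mask)
```

The resulting mask is a boolean Series and could then be passed as a row indexer.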

Nested-lists as row indexer

It is possible to pass different array-like indexers to different columns by using nested lists as indexer. The length of the outer list must match the number of columns of the dios. All items of the outer list must be array-like and not further nested, for example lists, pandas.Series, boolean lists or boolean pandas.Series, numpy.arrays, ... Every inner list-like item is applied as row indexer to the corresponding column.

>>> d
    a |    b |     c |     d | 
===== | ==== | ===== | ===== | 
0   0 | 2  5 | 4   7 | 6   0 | 
1   7 | 3  6 | 5  17 | 7   1 | 
2  14 | 4  7 | 6  27 | 8   2 | 
3  21 | 5  8 | 7  37 | 9   3 | 
4  28 | 6  9 | 8  47 | 10  4 | 

>>> d.aloc[ [d['a'], [True,False,True,False,False], [], [7,8,10]] ]
    a |    b |       c |     d | 
===== | ==== | ======= | ===== | 
0   0 | 2  5 | no data | 7   1 | 
1   7 | 4  7 |         | 8   2 | 
2  14 |      |         | 10  4 | 
3  21 |      |         |       | 
4  28 |      |         |       | 

>>> ar = np.array([2,3])
>>> d.aloc[[ar, ar+1, ar+2, ar+3]]
    a |    b |     c |    d | 
===== | ==== | ===== | ==== | 
2  14 | 3  6 | 4   7 | 6  0 | 
3  21 | 4  7 | 5  17 |      | 

Even though this looks like a 2D-indexer, which is explained in the next section, it is not. In contrast to a 2D-indexer, a column key can also be provided to pre-filter the columns.

>>> d.aloc[[ar, ar+1, ar+3], ['a','b','d']]
    a |    b |    d | 
===== | ==== | ==== | 
2  14 | 3  6 | 6  0 | 
3  21 | 4  7 |      | 
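
The one-indexer-per-column idea can be sketched with plain pandas. rows_for and nested_rows are hypothetical helpers over a dict of Series; the dispatch on the indexer type and the ValueError for a wrong outer length are assumptions of this sketch, not the documented dios behavior.

```python
import numpy as np
import pandas as pd

d = {
    "a": pd.Series([0, 7, 14, 21, 28], index=range(0, 5)),
    "b": pd.Series([5, 6, 7, 8, 9], index=range(2, 7)),
    "c": pd.Series([7, 17, 27, 37, 47], index=range(4, 9)),
    "d": pd.Series([0, 1, 2, 3, 4], index=range(6, 11)),
}

def rows_for(col, key):
    """Hypothetical helper: apply a single row indexer to a single column."""
    if isinstance(key, pd.Series):
        if pd.api.types.is_bool_dtype(key):
            key = key[key]                      # keep True labels only
        return col.loc[col.index.intersection(key.index)]
    arr = np.asarray(key)
    if arr.dtype == bool:                       # boolean list: positional mask
        if len(arr) != len(col):
            raise IndexError("Boolean index has wrong length")
        return col.loc[arr]
    return col[col.index.isin(arr)]             # label list: graceful filter

def nested_rows(columns, keys):
    """Hypothetical helper: one row indexer per column, matched by position."""
    if len(keys) != len(columns):
        raise ValueError("need exactly one row indexer per column")
    return {
        name: rows_for(col, key)
        for (name, col), key in zip(columns.items(), keys)
    }

mixed = nested_rows(d, [d["a"], [True, False, True, False, False], [], [7, 8, 10]])
ar = np.array([2, 3])
shifted = nested_rows(d, [ar, ar + 1, ar + 2, ar + 3])

print({name: col.to_dict() for name, col in mixed.items()})
print({name: col.to_dict() for name, col in shifted.items()})
```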

The power of 2D-indexer

Overview:

.aloc[bool-dios]                | 1. align columns, 2. align rows, 3. just take True's  | [1]
.aloc[dios, ...] (use Ellipsis) | 1. align columns, 2. align rows, (3.) ignore values   | [1]

[1] evaluate usebool-keyword
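
The three steps from the overview can be sketched with two dicts of pandas Series. align_2d is a hypothetical helper, not the dios implementation; dropping columns that are not common to both sides is an assumption, and usebool=False stands in for the ellipsis variant that ignores the values.

```python
import pandas as pd

d = {
    "a": pd.Series([0, 7, 14, 21, 28], index=range(0, 5)),
    "b": pd.Series([5, 6, 7, 8, 9], index=range(2, 7)),
    "c": pd.Series([7, 17, 27, 37, 47], index=range(4, 9)),
    "d": pd.Series([0, 1, 2, 3, 4], index=range(6, 11)),
}

def align_2d(columns, indexer, usebool=True):
    """Hypothetical helper: align columns, align rows, optionally take True's."""
    out = {}
    for name, col in columns.items():
        if name not in indexer:
            continue                                  # 1. align columns
        key = indexer[name]
        if usebool and pd.api.types.is_bool_dtype(key):
            key = key[key]                            # 3. just take True's
        common = col.index.intersection(key.index)    # 2. align rows
        out[name] = col.loc[common]
    return out

# A boolean 2D-indexer with one column ("z") that does not exist in d.
mask2d = {
    "a": d["a"] > 5,
    "c": d["c"] > 20,
    "z": pd.Series([True], index=[0]),
}

taken = align_2d(d, mask2d)                    # values are evaluated
ignored = align_2d(d, mask2d, usebool=False)   # values are ignored

print({name: col.index.tolist() for name, col in taken.items()})
print({name: col.index.tolist() for name, col in ignored.items()})
```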

T_O_D_O