From 0a7041c33e069252a779f6bc32c0dea23b4d6459 Mon Sep 17 00:00:00 2001
From: Bert Palm <bert.palm@ufz.de>
Date: Thu, 26 Mar 2020 17:03:37 +0100
Subject: [PATCH] more doku

---
 docs/aloc_usage.md | 115 +++++++++++++++++++++++++++++++++------------
 1 file changed, 86 insertions(+), 29 deletions(-)

diff --git a/docs/aloc_usage.md b/docs/aloc_usage.md
index 9f86e1c..6fe5f9e 100644
--- a/docs/aloc_usage.md
+++ b/docs/aloc_usage.md
@@ -85,31 +85,32 @@ Some indexer are linked to later sections, where a more detailed explanation and
 Values that are list- or array-like, which includes pd.Series, are set on all selected columns. pd.Series align
 like `s1.loc[:] = s2` do. See also the [cookbook](/docs/cookbook.md#broadcast-array-likes-to-multiple-columns).
 
 
-*Indexer Table*
-| example | type | on | like `.loc` | handling | conditions | link |
-| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
+Indexer Table
+-------------
+
+| example | type | on | like `.loc` | handling | conditions / hints | link |
+| ------- | ---- | --- | ----------- | -------- | ------------------ | ---- |
 |[Column indexer](#select-columns-gracefully)|
-| `.aloc[any, ['a']]` | scalar | columns |no | select graceful | - | [link](#select-columns-gracefully)|
-| `.aloc[any, 'b':'z']` | slice | columns |yes| slice | - | [link](#select-columns-gracefully)|
-| `.aloc[any, ['a','c']]` | list-like | columns |no | filter graceful | - | [link](#select-columns-gracefully)|
-| `.aloc[any [True,False]]` | bool list-like | columns |yes| take `True`'s , length must match (!) | - | [link](#select-columns-gracefully)|
-| `.aloc[any, s]` | pandas.Series | columns |no | like list, only values | - | |
-| `.aloc[any, bs]` | bool pandas.Series | columns |yes| like bool-list | - | |
+| `.aloc[any, 'a']` | scalar | columns |no | select graceful | - | [cols](#select-columns-gracefully)|
+| `.aloc[any, 'b':'z']` | slice | columns |yes| slice | - | [cols](#select-columns-gracefully)|
+| `.aloc[any, ['a','c']]` | list-like | columns |no | filter graceful | - | [cols](#select-columns-gracefully)|
+| `.aloc[any, [True,False]]` | bool list-like | columns |yes| take `True`'s | length must match nr of columns | [cols](#select-columns-gracefully)|
+| `.aloc[any, s]` | pandas.Series | columns |no | like list | only `s.values` are evaluated | [cols](#select-columns-gracefully)|
+| `.aloc[any, bs]` | bool pandas.Series | columns |yes| like bool-list | see there | [cols](#select-columns-gracefully)|
 |[Row indexer](#selecting-rows-a-smart-way)|
-| `.aloc[7, any]` | scalar | rows |no | translate to `.loc[key:key]` | - | |
-| `.aloc[3:42, any]` | slice | rows |yes| slice | - | |
-| `.aloc[[1,2,24], any]` | list-like | rows |no | filter graceful | - | |
-| `.aloc[[True,False], any]` | bool list-like | rows |yes| take `True`'s, length must match nr of (all selected) columns (!) | - | |
-| `.aloc[s, any]` | pandas.Series | rows |no | like `.loc[s.index]` | - | |
-| `.aloc[bs, any]` | bool pandas.Series | rows |no | align + just take `True`'s, [1] | - | |
-| `.aloc[[[s],[1,2,3]], any]` | nested list-like | both | ? | one row-indexer per column, outer length must match nr of (selected) columns(!) | - | |
+| `.aloc[7, any]` | scalar | rows |no | translate to `.loc[key:key]` | - | [rows](#selecting-rows-a-smart-way) |
+| `.aloc[3:42, any]` | slice | rows |yes| slice | - | |
+| `.aloc[[1,2,24], any]` | list-like | rows |no | filter graceful | - | [rows](#selecting-rows-a-smart-way) |
+| `.aloc[[True,False], any]` | bool list-like | rows |yes| take `True`'s | length must match nr of (all selected) columns | [blist](#boolean-array-likes-as-row-indexer)|
+| `.aloc[s, any]` | pandas.Series | rows |no | like `.loc[s.index]` | - | [ser](#pandasseries-and-boolean-pandasseries-as-row-indexer) |
+| `.aloc[bs, any]` | bool pandas.Series | rows |no | align + just take `True`'s | evaluate `usebool`-keyword | [ser](#pandasseries-and-boolean-pandasseries-as-row-indexer)|
+| `.aloc[[[s],[1,2,3]], any]` | nested list-like | both | ? | one row-indexer per column | outer length must match nr of (selected) columns | [nlist](#nested-lists-as-row-indexer) |
 |[2D-indexer](#the-power-of-2d-indexer)|
 | `.aloc[di]` | dios-like | both |no | full align | - | |
-| `.aloc[di, ...]` | dios-like | both |no | full align, ellipsis has no effect | - | |
-| `.aloc[di>5]` | bool dios-like | both |no | full align + take `True`'s [1] | - | |
-| `.aloc[di>5, ...]` | (bool) dios-like | both |no | full align, disable bool evaluation | - | |
-[1] evaluate `usebool`-keyword
+| `.aloc[di, ...]` | dios-like | both |no | full align | ellipsis has no effect | |
+| `.aloc[di>5]` | bool dios-like | both |no | full align + take `True`'s | evaluate `usebool`-keyword | |
+| `.aloc[di>5, ...]` | (bool) dios-like | both |no | full align, **no** bool evaluation | - | |
 
 Example dios
 ============
@@ -130,7 +131,7 @@ The dios used in the examples, unless stated otherwise:
 Select columns, gracefully
 ===========================
 
-**single columns**
+**Single columns**
 
 Use `.aloc[:, key]` to select a single column gracefully.
 The underling pandas Series is returned, if the key exist.
@@ -150,16 +151,12 @@ Series([], dtype: object)
 ```
 
 
-**multiple columns**
+**Multiple columns**
 
 Just like selecting *single columns gracefully*, but with a array-like indexer.
 A dios is returned, with a subset of the existing columns.
 If no key is present a empty dios is returned.
 
-If the key is a pandas.Series, its *values* are used for indexing, especially the Series's index is ignored.
-
-To select all columns simply use `.aloc[:,:]` or even simpler `.aloc[:]`, just like one would do with `loc` or `iloc`.
-
 ```
 >>> d.aloc[:, ['c', 99, None, 'a', 'x', 'y']]
     a |     c |
@@ -185,6 +182,24 @@ d.aloc[:, s]
 4  28 | 8  47 | 10  4 |
 ```
 
+**Boolean indexing, indexing with pd.Series and slice indexer**
+
+**Boolean indexer**, for example `[True, False, True, False]`, must have the same length as the number
+of columns; only the columns where the indexer has a `True` value are selected.
+
+If the key is a **pandas.Series**, its *values* are used for indexing; the Series's index is ignored. If a
+series has boolean values it is treated like a boolean indexer, otherwise it is treated as an array-like indexer.
+
+An easy way to select all columns is to use null-**slice**s, like `.aloc[:,:]` or even simpler `.aloc[:]`,
+just like one would do with `loc` or `iloc`. Of course, slicing with boundaries also works,
+e.g. `.aloc[:, 'a':'f']`.
+
+For more information about boolean or slice indexing see the pandas documentation
+[Slicing ranges](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-ranges)
+and
+[Boolean indexing](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing)
+
+
 Selecting Rows a smart way
 ==========================
 
@@ -328,9 +343,51 @@ as long as the two series objects not have the same index. But maybe one want to
 [DictOfSeries.index_of()](/docs/methods_and_properties.md#diosdictofseriesindex_of).
 
 
-**T_O_D_O - nested lists**
-- `.aloc[nested list]` - use a own list-like row indexer on each columns
-- sublists can be series, bool-series, lists or bool-lists
+Nested-lists as row indexer
+---------------------------
+
+It is possible to pass different array-like indexers to different columns by using a nested list as indexer.
+The outer list's length must match the number of columns of the dios. All items of the outer list must be
+array-like and not further nested, for example lists, pandas.Series, boolean lists or boolean pandas.Series, or numpy arrays.
+Every inner list-like item is applied as row indexer to the corresponding column.
+
+```
+>>> d
+    a |    b |     c |     d |
+===== | ==== | ===== | ===== |
+0   0 | 2  5 | 4   7 | 6   0 |
+1   7 | 3  6 | 5  17 | 7   1 |
+2  14 | 4  7 | 6  27 | 8   2 |
+3  21 | 5  8 | 7  37 | 9   3 |
+4  28 | 6  9 | 8  47 | 10  4 |
+
+>>> d.aloc[ [d['a'], [True,False,True,False,False], [], [7,8,10]] ]
+    a |    b |       c |     d |
+===== | ==== | ======= | ===== |
+0   0 | 2  5 | no data | 7   1 |
+1   7 | 4  7 |         | 8   2 |
+2  14 |      |         | 10  4 |
+3  21 |      |         |       |
+4  28 |      |         |       |
+
+>>> ar = np.array([2,3])
+>>> d.aloc[[ar, ar+1, ar+2, ar+3]]
+    a |    b |     c |    d |
+===== | ==== | ===== | ==== |
+2  14 | 3  6 | 4   7 | 6  0 |
+3  21 | 4  7 | 5  17 |      |
+```
+
+Even though this looks like a 2D-indexer, which is explained in the next section, it is not one.
+In contrast to the 2D-indexer, we can also provide a column key to pre-filter the columns.
+
+```
+>>> d.aloc[[ar, ar+1, ar+3], ['a','b','d']]
+    a |    b |    d |
+===== | ==== | ==== |
+2  14 | 3  6 | 6  0 |
+3  21 | 4  7 |      |
+```
 
 
 
-- 
GitLab
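
As a reading aid for the last hunk, the nested-list row indexer it documents can be modelled in plain Python. This is only a sketch under stated assumptions, not dios code: plain dicts of `{index label: value}` stand in for the pandas.Series columns, and `select_rows` and `nested_aloc` are hypothetical helper names. It also assumes that selected rows keep the column's own order, which the patch does not state.

```python
# Hypothetical model of the documented semantics; NOT the dios implementation.
# A column is a dict {index label: value} standing in for a pandas.Series.


def select_rows(column, indexer):
    """Apply one row indexer to one column, silently ignoring missing labels."""
    # For a dict (Series stand-in) this yields its keys, i.e. only the index
    # of a Series-like indexer is used; for a list it is a plain copy.
    keys = list(indexer)
    if keys and all(isinstance(k, bool) for k in keys):
        # Boolean indexer: positional mask, length must match the column.
        if len(keys) != len(column):
            raise ValueError("boolean indexer must match the column length")
        return {k: v for (k, v), keep in zip(column.items(), keys) if keep}
    # Label indexer: graceful filter (assumption: column order is kept).
    wanted = set(keys)
    return {k: v for k, v in column.items() if k in wanted}


def nested_aloc(columns, nested):
    """Apply one row indexer per column; the outer length must match."""
    if len(nested) != len(columns):
        raise ValueError("outer list needs exactly one row indexer per column")
    return {
        name: select_rows(col, idx)
        for (name, col), idx in zip(columns.items(), nested)
    }


# The example dios from the patch, written as dicts.
d = {
    "a": {0: 0, 1: 7, 2: 14, 3: 21, 4: 28},
    "b": {2: 5, 3: 6, 4: 7, 5: 8, 6: 9},
    "c": {4: 7, 5: 17, 6: 27, 7: 37, 8: 47},
    "d": {6: 0, 7: 1, 8: 2, 9: 3, 10: 4},
}

# Mirrors d.aloc[[d['a'], [True,False,True,False,False], [], [7,8,10]]]
mixed = nested_aloc(d, [d["a"], [True, False, True, False, False], [], [7, 8, 10]])

# Mirrors d.aloc[[ar, ar+1, ar+2, ar+3]] with ar = np.array([2, 3])
ar = [2, 3]
shifted = nested_aloc(d, [[i + off for i in ar] for off in range(4)])

print(mixed)
print(shifted)
```

Under these assumptions both calls select the same rows as the outputs shown in the patch: column `c` comes back empty for the `[]` indexer, and column `d` keeps only label 6 in the shifted case because label 5 does not exist there.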