diff --git a/dox/autodoc_diosapi.rst b/dox/_diosApi.rst
similarity index 100%
rename from dox/autodoc_diosapi.rst
rename to dox/_diosApi.rst
diff --git a/dox/conf.py b/dox/conf.py
index 52c7a69e55a079490941e3635fbecf3d2985846a..4413f9967a1030cb915d2c4decc37158e2a43253 100644
--- a/dox/conf.py
+++ b/dox/conf.py
@@ -37,6 +37,7 @@ extensions = [
     # "sphinx.ext.coverage",
     # "sphinx.ext.mathjax",
     # "sphinx.ext.ifconfig",
+    "sphinx.ext.autosectionlabel",
 
     # link source code
     "sphinx.ext.viewcode",
@@ -57,6 +58,8 @@ automodsumm_inherited_members = True
 automodapi_inheritance_diagram = False
 automodapi_toctreedirnm = '_api'
 # automodsumm_writereprocessed = True
+autosectionlabel_prefix_document = True
+
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
diff --git a/dox/index.rst b/dox/index.rst
index c5641f9e8fe35dff539eab275c264f9350c872e5..4145e362c2e60d2bed0fc0ba770e79764682d5bd 100644
--- a/dox/index.rst
+++ b/dox/index.rst
@@ -17,22 +17,24 @@ the class :class:`dios.DictOfSeries`. See
 
 For some recipes and advanced usage see:
 
+.. toctree::
+
+   indexingDocs
+
 .. toctree::
 
    cookbook
-   indexing_help
 
 For full module api documentation see:
 
 .. toctree::
    :maxdepth: 2
 
-   autodoc_diosapi
+   _diosApi
 
 or browse the Index..
 
 .. toctree::
-   :hidden:
 
    genindex
 
diff --git a/dox/indexing_help.md b/dox/indexingDocs.md
similarity index 76%
rename from dox/indexing_help.md
rename to dox/indexingDocs.md
index 533dc2a583b52435e23febd732c88e6d18ef7ab7..9a9f8306a897b668a870e497753a4e3fa662b596 100644
--- a/dox/indexing_help.md
+++ b/dox/indexingDocs.md
@@ -1,14 +1,92 @@
-Indexing with .aloc
-===================
+Pandas-like indexing
+====================
+
+`[]` and `.loc[]`, `.iloc[]` and `.at[]`, `.iat[]` - should behave exactly like
+their counterparts from pandas.DataFrame. They accept the following as indexers:
+- lists, array-like objects and in general all iterables
+- boolean lists and iterables
+- slices
+- scalars and any hashable object
+
+Most indexers are passed directly to the underlying column series or row series, depending
+on the position of the indexer and the complexity of the operation. For `.loc`, `.iloc`, `.at`
+and `.iat` the first position is the *row indexer*, the second the *column indexer*. The second
+can be omitted and will default to `slice(None)`. Examples:
+- `di.loc[[1,2,3], ['a']]` : select labels 1,2,3 from column a
+- `di.iloc[[1,2,3], [0,3]]` : select positions 1,2,3 from the columns 0 and 3
+- `di.loc[:, 'a':'c']` : select all rows from columns a to c
+- `di.at[4,'c']` : select the element with label 4 in column c
+- `di.loc[:]` -> `di.loc[:,:]` : select everything.
+
+Scalar indexing always returns a pandas Series if the other indexer is a non-scalar. If both indexers
+are scalars, the element itself is returned. In all other cases a dios is returned.
+For more pandas-like indexing magic and the differences between the indexers,
+see the [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html).
 
-Purpose
---------
-- select gracefully, so rows or columns, that was given as indexer, but doesn't exist, not raise an error
+>**Note:**
+>
+>In contrast to pandas.DataFrame, `.loc[:]` and `.loc[:, :]` always behave identically. The same applies to `iloc` and
+>[`aloc`](#the-special-indexer-aloc). For example, two pandas.DataFrames `df1` and `df2` with different columns
+>align their columns with `df1.loc[:, :] = df2`, but do **not** with `df1.loc[:] = df2`.
+>
+>Whether this is the desired behavior or a bug, I couldn't verify so far. -- Bert Palm
+
+**2D-indexer**
+
+`dios[boolean dios-like]` (as single key) - dios accepts boolean 2D-indexers (boolean pandas.DataFrame
+or boolean Dios).
+
+Columns and rows from the indexer align with the dios.
+This means that only matching columns are selected, and in these columns only those rows are selected where
+i) the indices match and ii) the value is True in the boolean indexer. There is no difference between
+missing indices and present indices with False values.
+
+Values from unselected rows and columns are dropped, but empty columns are still preserved,
+with the effect that the resulting Dios always has the same column dimension as the initial dios.
+
+>**Note:**
+>This is the same behavior as pandas.DataFrame's handling of 2D-indexers, except that pandas.DataFrame
+>fills missing locations with numpy.nan and therefore also fills whole missing columns with numpy.nan.
+
+**setting values**
+
+Setting values with `[]` and `.loc[]`, `.iloc[]` and `.at[]`, `.iat[]` works like in pandas.
+With `.at`/`.iat` only single items can be set; for the others the
+right-hand side values can be:
+ - *scalars*: these are broadcast to the selected positions
+ - *lists*: the length of the list must match the number of indexed columns. The items can be everything that
+   can be applied to a series, with the respective indexing method (`loc`, `iloc`, `[]`).
+ - *dios*: the number of columns must match the number of indexed columns; columns do *not* align,
+   they are just iterated.
+   Rows do align. Rows that are present on the right but not on the left are ignored.
+   Rows that are present on the left (bear in mind: these rows were explicitly chosen for write!), but not present
+   on the right, are filled with `NaN`s, like in pandas.
+ - *pandas.Series*: column indexer must be a scalar(!), the series is passed down, and set with `loc`, `iloc` or `[]`
+   by pandas Series, where it may align, depending on the method.
+
+**Examples:**
+
+- `dios.loc[2:5, 'a'] = [1,2,3]` is the same as `a=dios['a']; a.loc[2:5]=[1,2,3]; dios['a']=a`
+- `dios.loc[2:5, :] = 99` : set 99 on rows 2 to 5 on all columns
+
+Special indexer `.aloc`
+========================
+
+In addition to the pandas-like indexers, there is an `.aloc[..]` (align locator) indexing method.
+Unlike with `.iloc` and `.loc`, indexers fully align if possible and 1D-array-likes can be broadcast
+to multiple columns at once. This method also handles missing indexer-items gracefully.
+It is used like `.loc`, so a single indexer (`.aloc[indexer]`) or a tuple of row-indexer and
+column-indexer (`.aloc[row-indexer, column-indexer]`) can be given. It can also handle boolean and *non-boolean*
+2D-indexers.
+
+The main **purpose** of `.aloc` is:
+- to select gracefully, so that rows or columns that were given as indexer but don't exist do not raise an error
 - align series/dios-indexer
-- setting multiple columns at once with a list-like value
+- vertical broadcasting, i.e. setting multiple columns at once with a list-like value
+
+Aloc usage
+----------
 
-Overview
---------
 `aloc` is *called* like `loc`, with a single key, that act as row indexer `aloc[rowkey]` or with a tuple of
 row indexer and column indexer `aloc[rowkey, columnkey]`. Also 2D-indexer (like dios or df) can be given, but only as
 a single key, like `.aloc[2D-indexer]` or with the special column key `...`,
@@ -90,8 +168,8 @@ Values that are list- or array-like, which includes pd.Series, are set on all se
 like `s1.loc[:] = s2` do.
 See also the [cookbook](/docs/cookbook.md#broadcast-array-likes-to-multiple-columns).
 
-Indexer Table
--------------
+Aloc overview table
+---------------------
 
 | example | type | on | like `.loc` | handling | conditions / hints | link |
 | ------- | ---- | --- | ----------- | -------- | ------------------ | ---- |
@@ -147,9 +225,7 @@ Looks like so:
 Select columns, gracefully
 ---------------------------
 
-**Single columns**
-
-Use `.aloc[:, key]` to select a single column gracefully.
+One can use `.aloc[:, key]` to select **single columns** gracefully.
 The underling pandas.Series is returned, if the key exist.
 Otherwise a empty pandas.Series with `dtype=object` is returned.