Commit 9983b157 authored by Bert Palm's avatar Bert Palm 🎇

adjusted ReadMe

parent f91b4193
......@@ -20,6 +20,27 @@ Features
* behaves quite like a pandas.DataFrame
* additional align locator (`.aloc[]`)
Install
-------
todo: PyPI
```
import dios
# Have fun :)
```
Documentation
-------------
todo: link to ReadTheDocs
Local docs about:
* [Indexing](/docs/doc_indexing.md)
* [Cookbook](/docs/doc_cookbook.md)
* [Itype](/docs/doc_itype.md)
TL;DR
-----
**get it**
......@@ -77,157 +98,3 @@ Columns: ['x', 'y']
2 spam | |
```
Pandas-like indexing
--------------------
`[]` and `.loc[]`, `.iloc[]` and `.at[]`, `.iat[]` should behave exactly like
their counterparts from pandas.DataFrame. They accept as indexer
- lists, array-like objects and in general all iterables
- boolean lists and iterables
- slices
- scalars and any hashable object
Most indexers are passed directly to the underlying column series or row series, depending
on the position of the indexer and the complexity of the operation. For `.loc`, `.iloc`, `.at`
and `.iat` the first position is the *row indexer*, the second the *column indexer*. The second
can be omitted and defaults to `slice(None)`. Examples:
- `di.loc[[1,2,3], ['a']]` : select labels 1,2,3 from column a
- `di.iloc[[1,2,3], [0,3]]` : select positions 1,2,3 from the columns 0 and 3
- `di.loc[:, 'a':'c']` : select all rows from columns a to c (label slices include both ends)
- `di.at[4,'c']` : select the element with label 4 in column c
- `di.loc[:]` -> `di.loc[:,:]` : select everything.
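
Since these indexers are meant to mirror pandas, the examples above can be tried on a plain `pandas.DataFrame`. This is only a sketch of the pandas side; the frame `df` and its values are made up for illustration, and dios itself is not used here.

```python
import pandas as pd

# Illustrative frame with row labels 1..4 and columns a..d (made-up data).
df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12], "d": [13, 14, 15, 16]},
    index=[1, 2, 3, 4],
)

by_label = df.loc[[1, 2, 3], ["a"]]       # labels 1, 2, 3 from column a
by_position = df.iloc[[1, 2, 3], [0, 3]]  # positions 1, 2, 3 from columns 0 and 3
label_slice = df.loc[:, "a":"c"]          # all rows, columns a to c (inclusive)
single = df.at[4, "c"]                    # the element with label 4 in column c
everything = df.loc[:]                    # column indexer defaults to slice(None)
```

Note that the label slice `'a':'c'` includes column `c`, unlike a positional slice.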
Scalar indexing always returns a pandas Series if the other indexer is a non-scalar. If both indexers
are scalars, the element itself is returned. In all other cases a dios is returned.
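
The same return-type rule can be observed in pandas, where the 2D container is a DataFrame rather than a dios. The following sketch uses made-up data and only pandas calls:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=[10, 20, 30])

row = df.loc[10, ["a", "b"]]     # scalar row indexer, non-scalar columns
col = df.loc[[10, 20], "a"]      # non-scalar rows, scalar column indexer
item = df.loc[10, "a"]           # both scalar: the element itself
block = df.loc[[10, 20], ["a"]]  # both non-scalar: stays two-dimensional
```

Here `row` and `col` are Series, `item` is a single value, and `block` is a DataFrame (a dios would take its place in the dios case).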
For more pandas-like indexing magic and the differences between the indexers,
see the [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html).
>**Note:**
>
>In contrast to pandas.DataFrame, `.loc[:]` and `.loc[:, :]` always behave identically. The same applies to `iloc` and
>[`aloc`](#the-special-indexer-aloc). For example, given two pandas.DataFrames `df1` and `df2` with different columns,
>`df1.loc[:, :] = df2` aligns the columns, but `df1.loc[:] = df2` does **not**.
>
>Whether this is the desired behavior or a bug, I couldn't verify so far. -- Bert Palm
**2D-indexer**
`dios[boolean dios-like]` (as single key) - dios accepts boolean 2D-indexers (boolean pandas.DataFrame
or boolean Dios).
Columns and rows of the indexer align with the dios.
This means that only matching columns are selected, and in these columns only those rows are selected where
i) the indices match and ii) the value in the boolean indexer is True. Indices that are missing from the indexer
are treated the same as indices that are present but False.
Values from unselected rows and columns are dropped, but empty columns are preserved,
so the resulting Dios always has the same column dimension as the initial dios.
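
The selection rule described above can be emulated on a plain dict of `pandas.Series`. This is a minimal sketch, not the dios implementation; `select_2d` is a hypothetical helper and the data are made up.

```python
import pandas as pd

def select_2d(data, mask):
    """Hypothetical helper: boolean 2D selection on a dict of Series."""
    out = {}
    for name, series in data.items():
        if name in mask:
            # Align the mask to the rows of this column; rows missing
            # from the mask are treated like False.
            m = mask[name].reindex(series.index, fill_value=False).astype(bool)
            out[name] = series[m]
        else:
            # Column not present in the mask: keep it, but empty.
            out[name] = series.iloc[:0]
    return out

data = {
    "a": pd.Series([1, 2, 3], index=[0, 1, 2]),
    "b": pd.Series([10, 20], index=[1, 2]),
}
mask = {"a": pd.Series([True, False], index=[1, 5])}
result = select_2d(data, mask)
```

In `result`, column `a` keeps only the row with label 1, and column `b` is kept as an empty series, so the number of columns is unchanged.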
>**Note:**
>This is exactly how pandas.DataFrame handles 2D-indexers, except that pandas.DataFrame
>fills missing locations with numpy.nan and therefore also fills entirely missing columns with numpy.nan.
**setting values**
Setting values with `[]` and `.loc[]`, `.iloc[]` and `.at[]`, `.iat[]` works like in pandas.
With `.at`/`.iat` only single items can be set; for the others, the
right-hand side value can be:
- *scalars*: these are broadcast to the selected positions
- *lists*: the length of the list must match the number of indexed columns. The items can be anything that
can be assigned to a series with the respective indexing method (`loc`, `iloc`, `[]`).
- *dios*: the number of columns must match the number of indexed columns. Columns do *not* align,
they are just iterated.
Rows do align. Rows that are present on the right but not on the left are ignored.
Rows that are present on the left (bear in mind: these rows were explicitly chosen for writing!), but not present
on the right, are filled with `NaN`s, like in pandas.
- *pandas.Series*: the column indexer must be a scalar(!). The series is passed down and set with `loc`, `iloc` or `[]`
by the pandas Series, where it may be aligned, depending on the method.
**Examples:**
- `dios.loc[2:5, 'a'] = [1,2,3]` is the same as `a=dios['a']; a.loc[2:5]=[1,2,3]; dios['a']=a`
- `dios.loc[2:5, :] = 99` : set 99 on rows 2 to 5 on all columns
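
The per-column equivalence from the first example, and the row alignment described for series values, can be reproduced with pandas alone. This sketch uses a plain dict of Series in place of a dios; names and values are made up.

```python
import pandas as pd

data = {"a": pd.Series([1.0, 2.0, 3.0, 4.0], index=[1, 2, 3, 4])}

# Spelled-out form of a column-wise assignment: fetch the column,
# set the rows on the series, and store the column back.
a = data["a"]
a.loc[2:4] = [7.0, 8.0, 9.0]  # label slice, both ends included
data["a"] = a

# A Series on the right-hand side is aligned on its row labels by `.loc`:
# label 7 does not exist on the left and is ignored,
# label 3 is selected on the left but missing on the right and becomes NaN.
a.loc[[1, 2, 3]] = pd.Series([10.0, 20.0, 99.0], index=[1, 2, 7])
```

After both assignments the column holds 10.0 and 20.0 for labels 1 and 2, `NaN` for label 3, and the untouched 9.0 for label 4.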
The special indexer `.aloc`
-----------------------------
In addition to the pandas-like indexers, there is an `.aloc[..]` (align locator) indexing method.
Unlike with `.iloc` and `.loc`, indexers fully align if possible, and 1D-array-likes can be broadcast
to multiple columns at once. This method also handles missing indexer items gracefully.
It is used like `.loc`, so a single indexer (`.aloc[indexer]`) or a tuple of row indexer and
column indexer (`.aloc[row-indexer, column-indexer]`) can be given. It can also handle boolean and *non-boolean*
2D-indexers.
For more information and examples see the [aloc usage](/docs/aloc_usage.md) and the [cookbook](docs/cookbook.md).
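
To illustrate what "broadcasting a 1D indexer to multiple columns" and "handling missing items gracefully" can mean, here is a rough emulation on a dict of `pandas.Series`. It is an assumption-laden sketch, not the `.aloc` implementation; `align_select` is a hypothetical helper.

```python
import pandas as pd

def align_select(data, rows):
    """Hypothetical helper: apply one list of row labels to every column,
    silently skipping labels a column does not have."""
    wanted = pd.Index(rows)
    return {
        name: series.loc[series.index.intersection(wanted)]
        for name, series in data.items()
    }

data = {
    "a": pd.Series([1, 2, 3], index=[0, 1, 2]),
    "b": pd.Series([4, 5], index=[2, 3]),
}
selected = align_select(data, [2, 3, 9])
```

With plain `.loc`, the missing label 9 would raise a `KeyError` in pandas; here each column simply returns the labels it actually has.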
Properties
----------
See also the [Properties documentation](/docs/methods_and_properties.md#properties)
>**Note:**
>
> Properties that are also implemented in pandas.DataFrame mostly work analogously in dios.DictOfSeries.
- columns
- indexes
- lengths
- values
- dtypes
- itype
- empty
- size
Methods and implied features
-------
See also the [Methods documentation](/docs/methods_and_properties.md#methods)
>**Note:**
>
> Methods that are also implemented in pandas.DataFrame mostly work analogously in dios.DictOfSeries.
- `copy()`
- `copy_empty()`
- `all()`
- `any()`
- `squeeze()`
- `to_df()`
- `to_string()`
- `apply()`
- `astype()`
- `isin()`
- `isna()`
- `notna()`
- `dropna()`
- `memory_usage()`
- `index_of()`
- `in`
- `is`
- `len(Dios)`
Operators and Comparators
---------
- arithmetical: `+ - * ** // / %` and `abs()`
- boolean: `&^|~`
- comparators: `== != > >= < <=`
Itype
-----
DictOfSeries holds multiple series, and each series can have a different index length
and index type. Differing index lengths are either resolved by alignment, or the operation simply fails if
aligning makes no sense (e.g. assigning the very same list to series of different lengths, see `.aloc`).
A bigger challenge is the type of the index. If one series has an alphabetical index and another one
a numeric index, selecting along columns can fail in every scenario. To keep track of the
index types, or to prohibit inserting an index of a *non-fitting* type,
we introduce the `itype`. It can be set on creation of a Dios and can also be changed during usage.
When the itype changes, all indexes of all series in the dios are cast to a new fitting type,
if possible. Different cast mechanisms are available.
If an itype prohibits certain index types and a series with a non-fitting index type is inserted,
an implicit type cast is done (with or without a warning) or an error is raised. The warning/error policy
can be adjusted via global options.
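
The cast-or-fail policy described above can be sketched with pandas alone. This is not the dios `itype` machinery; `insert_numeric`, its `policy` argument and the choice of a float index as the "fitting" type are all assumptions made for illustration.

```python
import warnings

import pandas as pd
from pandas.api.types import is_numeric_dtype

def insert_numeric(data, name, series, policy="warn"):
    """Hypothetical helper: insert `series` into the dict `data`,
    requiring (or casting to) a numeric index."""
    if not is_numeric_dtype(series.index.dtype):
        if policy == "error":
            raise TypeError(f"index of {name!r} is not numeric")
        try:
            new_index = series.index.astype("float64")
        except (ValueError, TypeError) as exc:
            raise TypeError(f"index of {name!r} cannot be cast to numeric") from exc
        if policy == "warn":
            warnings.warn(f"index of {name!r} was cast to float64")
        series = series.copy()
        series.index = new_index
    data[name] = series

data = {}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    insert_numeric(data, "a", pd.Series([1, 2], index=["1", "2"]))
```

Here the string labels `"1"` and `"2"` can be cast, so the series is inserted with a float index and a warning is issued; labels such as `"x"` would raise a `TypeError` instead.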
Have fun :)
Itype
=====
\ No newline at end of file
=====
DictOfSeries holds multiple series, and each series can have a different index length
and index type. Differing index lengths are either resolved by alignment, or the operation simply fails if
aligning makes no sense (e.g. assigning the very same list to series of different lengths, see `.aloc`).
A bigger challenge is the type of the index. If one series has an alphabetical index and another one
a numeric index, selecting along columns can fail in every scenario. To keep track of the
index types, or to prohibit inserting an index of a *non-fitting* type,
we introduce the `itype`. It can be set on creation of a Dios and can also be changed during usage.
When the itype changes, all indexes of all series in the dios are cast to a new fitting type,
if possible. Different cast mechanisms are available.
If an itype prohibits certain index types and a series with a non-fitting index type is inserted,
an implicit type cast is done (with or without a warning) or an error is raised. The warning/error policy
can be adjusted via global options.