graviti.paging.factory
Paging list related classes.
Module Contents
Classes
- LazyFactoryBase is the base class of the lazy factory.
- LazyFactory is a factory for requesting source data and creating paging lists.
- LazySubFactory is a factory for creating paging lists.
- LazyLowerCaseFactory is a factory to handle the case-insensitive data from the Graviti back-end.
- LazyLowerCaseSubFactory is a sub-factory to handle the case-insensitive data.
- class graviti.paging.factory.LazyFactoryBase
LazyFactoryBase is the base class of the lazy factory.
- abstract create_list(self, mapper)
Create a paging list from the factory.
- Parameters
mapper (Callable[[Any], _T]) – A callable object to convert every item in the pyarrow array.
- Raises
NotImplementedError – The method of the base class should not be called.
- Return type
- abstract create_mapped_list(self, mapper)
Create a paging list from the factory.
- Parameters
mapper (Callable[[Any], _T]) – A callable object to convert every item in the pyarrow array.
- Raises
NotImplementedError – The method of the base class should not be called.
- Return type
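
The base-class contract above can be pictured with a minimal, hypothetical sketch. `SketchFactoryBase` and `SketchListFactory` are illustrative names, not graviti classes, and a plain Python list stands in for the paging list because the actual return type is not shown in this reference.

```python
from typing import Any, Callable, List, TypeVar

_T = TypeVar("_T")


class SketchFactoryBase:
    """Hypothetical stand-in for a base factory; not the graviti class."""

    def create_list(self, mapper: Callable[[Any], _T]) -> List[_T]:
        # The base class only defines the interface.
        raise NotImplementedError

    def create_mapped_list(self, mapper: Callable[[Any], _T]) -> List[_T]:
        # Same here: subclasses are expected to override this method.
        raise NotImplementedError


class SketchListFactory(SketchFactoryBase):
    """Hypothetical subclass backed by an in-memory list."""

    def __init__(self, items: List[Any]) -> None:
        self._items = items

    def create_list(self, mapper: Callable[[Any], _T]) -> List[_T]:
        # Apply the mapper to every source item.
        return [mapper(item) for item in self._items]
```

Calling `create_list` on the base class raises `NotImplementedError`, while the subclass converts each item with the supplied `mapper`.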
- class graviti.paging.factory.LazyFactory(total_count, limit, getter, patype)
Bases: LazyFactoryBase
LazyFactory is a factory for requesting source data and creating paging lists.
- Parameters
total_count (int) – The total count of the elements in the paging lists.
limit (int) – The size of each lazy load page.
getter (Callable[[int, int], Any]) – A callable object to get the source data.
patype (pyarrow.DataType) – The pyarrow DataType of the data in the factory.
Examples
>>> from typing import Any, Dict, List
>>> import pyarrow as pa
>>> from graviti.paging.factory import LazyFactory
>>> patype = pa.struct(
...     {
...         "remotePath": pa.string(),
...         "label": pa.struct({"CLASSIFICATION": pa.struct({"category": pa.string()})}),
...     }
... )
>>> TOTAL_COUNT = 1000
>>> def getter(offset: int, limit: int) -> List[Dict[str, Any]]:
...     stop = min(offset + limit, TOTAL_COUNT)
...     return [
...         {
...             "remotePath": f"{i:06}.jpg",
...             "label": {"CLASSIFICATION": {"category": "cat" if i % 2 else "dog"}},
...         }
...         for i in range(offset, stop)
...     ]
...
>>> factory = LazyFactory(TOTAL_COUNT, 128, getter, patype)
>>> paths = factory["remotePath"].create_pyarrow_list()
>>> categories = factory["label"]["CLASSIFICATION"]["category"].create_pyarrow_list()
>>> len(paths)
1000
>>> list(paths)
[<pyarrow.StringScalar: '000000.jpg'>,
 <pyarrow.StringScalar: '000001.jpg'>,
 <pyarrow.StringScalar: '000002.jpg'>,
 <pyarrow.StringScalar: '000003.jpg'>,
 <pyarrow.StringScalar: '000004.jpg'>,
 <pyarrow.StringScalar: '000005.jpg'>,
 ...
 ...
 <pyarrow.StringScalar: '000999.jpg'>]
>>> len(categories)
1000
>>> list(categories)
[<pyarrow.StringScalar: 'dog'>,
 <pyarrow.StringScalar: 'cat'>,
 <pyarrow.StringScalar: 'dog'>,
 <pyarrow.StringScalar: 'cat'>,
 <pyarrow.StringScalar: 'dog'>,
 ...
 ...
 <pyarrow.StringScalar: 'cat'>]
- get_array(self, pos, keys)
Get the array from the factory.
- Parameters
pos (int) – The page number.
keys (Tuple[str, Ellipsis]) – The keys to access the array from factory.
- Returns
The requested pyarrow array.
- Return type
pyarrow.Array
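
One way to picture how `pos` and `keys` select data is the hypothetical sketch below. It assumes that page `pos` starts at offset `pos * limit` and that `keys` descends one nested field per key; neither detail is confirmed by this reference. Plain Python lists and dicts stand in for pyarrow arrays, and `get_page_values` and `demo_getter` are illustrative helpers, not graviti functions.

```python
from typing import Any, Callable, Dict, List, Tuple


def get_page_values(
    getter: Callable[[int, int], List[Dict[str, Any]]],
    limit: int,
    pos: int,
    keys: Tuple[str, ...],
) -> List[Any]:
    """Fetch page `pos` and extract the nested field addressed by `keys`."""
    # Assumption: page `pos` starts at offset `pos * limit`.
    rows = getter(pos * limit, limit)
    values: List[Any] = []
    for row in rows:
        item: Any = row
        for key in keys:  # descend one nesting level per key
            item = item[key]
        values.append(item)
    return values


def demo_getter(offset: int, limit: int) -> List[Dict[str, Any]]:
    """Hypothetical data source holding 10 rows."""
    stop = min(offset + limit, 10)
    return [{"label": {"category": "cat" if i % 2 else "dog"}} for i in range(offset, stop)]


page = get_page_values(demo_getter, 4, 2, ("label", "category"))
print(page)  # ['dog', 'cat']
```

With 10 rows and a page size of 4, page 2 is the short final page holding rows 8 and 9.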
- create_list(self, mapper)
Create a paging list from the factory.
- Parameters
mapper (Callable[[Any], _T]) – A callable object to convert every item in the pyarrow array.
- Returns
A paging list created from the factory.
- Return type
- create_mapped_list(self, mapper)
Create a paging list from the factory.
- Parameters
mapper (Callable[[Any], _T]) – A callable object to convert every item in the pyarrow array.
- Returns
A paging list created from the factory.
- Return type
- create_pyarrow_list(self)
Create a paging list from the factory.
- Returns
A paging list created from the factory.
- Return type
- get_page_lengths(self)
A generator that yields the length of each page in the factory.
- Yields
The page lengths.
- Return type
Iterator[int]
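
As a rough illustration of what such a generator might produce, the sketch below assumes every page holds `limit` items except possibly the last one. `page_lengths` is a hypothetical standalone function, not the graviti method.

```python
from typing import Iterator


def page_lengths(total_count: int, limit: int) -> Iterator[int]:
    """Yield the length of each page for `total_count` items."""
    full_pages, remainder = divmod(total_count, limit)
    for _ in range(full_pages):
        yield limit
    if remainder:
        # The last page holds whatever is left over.
        yield remainder


print(list(page_lengths(1000, 128)))  # [128, 128, 128, 128, 128, 128, 128, 104]
```

With the values from the `LazyFactory` example (1000 items, pages of 128), this gives seven full pages and a final page of 104 items.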
- class graviti.paging.factory.LazySubFactory(factory, keys, patype)
Bases: LazyFactoryBase
LazySubFactory is a factory for creating paging lists.
- Parameters
factory (LazyFactory) – The source LazyFactory instance.
keys (Tuple[str, Ellipsis]) – The keys to access the array from the source LazyFactory.
patype (pyarrow.DataType) – The pyarrow DataType of the data in the sub-factory.
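
The relationship between a source factory and the `keys` of a sub-factory can be sketched as follows. This assumes that indexing a factory by field name, as in `factory["label"]["CLASSIFICATION"]["category"]` from the `LazyFactory` example, produces a sub-factory that keeps a reference to the same source. `SketchFactory` and `SketchSubFactory` are hypothetical classes, not the graviti implementation.

```python
from typing import Tuple


class SketchFactory:
    """Hypothetical root factory; indexing it yields a sub-factory."""

    def __getitem__(self, key: str) -> "SketchSubFactory":
        return SketchSubFactory(self, (key,))


class SketchSubFactory:
    """Hypothetical sub-factory that remembers its key path into the root."""

    def __init__(self, factory: SketchFactory, keys: Tuple[str, ...]) -> None:
        self.factory = factory
        self.keys = keys

    def __getitem__(self, key: str) -> "SketchSubFactory":
        # Extend the key path while keeping a reference to the same root.
        return SketchSubFactory(self.factory, self.keys + (key,))


root = SketchFactory()
sub = root["label"]["CLASSIFICATION"]["category"]
print(sub.keys)  # ('label', 'CLASSIFICATION', 'category')
```

Keeping only a key path in the sub-factory means the source data is requested once by the root and shared by every nested field.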
- create_list(self, mapper)
Create a paging list from the factory.
- Parameters
mapper (Callable[[Any], _T]) – A callable object to convert every item in the pyarrow array.
- Returns
A paging list created from the factory.
- Return type
- create_mapped_list(self, mapper)
Create a paging list from the factory.
- Parameters
mapper (Callable[[Any], _T]) – A callable object to convert every item in the pyarrow array.
- Returns
A paging list created from the factory.
- Return type
- class graviti.paging.factory.LazyLowerCaseFactory(total_count, limit, getter, patype)
Bases: LazyFactory
LazyLowerCaseFactory is a factory to handle the case-insensitive data from the Graviti back-end.
- Parameters
total_count (int) – The total count of the elements in the paging lists.
limit (int) – The size of each lazy load page.
getter (Callable[[int, int], Any]) – A callable object to get the source data.
patype (pyarrow.DataType) – The pyarrow DataType of the data in the factory.
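
This reference does not describe how the case-insensitive handling works. One plausible approach, shown here purely as an assumption, is to normalize every field name to lower case before lookup. `lower_keys` is a hypothetical helper, not a graviti function.

```python
from typing import Any


def lower_keys(data: Any) -> Any:
    """Recursively lower-case every dictionary key in `data`."""
    if isinstance(data, dict):
        return {str(key).lower(): lower_keys(value) for key, value in data.items()}
    if isinstance(data, list):
        return [lower_keys(item) for item in data]
    # Scalars (strings, numbers, None) are returned unchanged.
    return data


row = {"remotePath": "000000.jpg", "label": {"CLASSIFICATION": {"category": "dog"}}}
normalized = lower_keys(row)
print(normalized["remotepath"])  # 000000.jpg
```

Only keys are changed; values such as `"000000.jpg"` and `"dog"` keep their original case.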
- class graviti.paging.factory.LazyLowerCaseSubFactory(factory, keys, patype)
Bases: LazySubFactory
LazyLowerCaseSubFactory is a sub-factory to handle the case-insensitive data.
- Parameters
factory (LazyFactory) – The source LazyFactory instance.
keys (Tuple[str, Ellipsis]) – The keys to access the array from the source LazyFactory.
patype (pyarrow.DataType) – The pyarrow DataType of the data in the sub-factory.