graviti.manager.dataset

The implementation of the Dataset and DatasetManager.

Module Contents

Classes

RevisionType

RevisionType is an enumeration type including "BRANCH", "COMMIT" and "TAG".

ObjectPermissionManagerType

ObjectPermissionManagerType is an enumeration type including "OSS", "S3" and "AZURE".

Dataset

This class defines the basic concept of the dataset on Graviti.

DatasetManager

This class defines the operations on the dataset on Graviti.

Attributes

graviti.manager.dataset.logger
graviti.manager.dataset.handler

class graviti.manager.dataset.RevisionType

Bases: enum.Enum

RevisionType is an enumeration type including “BRANCH”, “COMMIT” and “TAG”.

class graviti.manager.dataset.ObjectPermissionManagerType

Bases: enum.Enum

ObjectPermissionManagerType is an enumeration type including “OSS”, “S3” and “AZURE”.
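As a rough illustration of how the two enumerations can be used, the sketch below defines local stand-ins with the member names listed above. The member values and the `parse_revision_type` helper are assumptions made for this example; only the member names come from this page.

```python
from enum import Enum


class RevisionType(Enum):
    """Local stand-in; the member values are assumed, not taken from the library."""

    BRANCH = "BRANCH"
    COMMIT = "COMMIT"
    TAG = "TAG"


class ObjectPermissionManagerType(Enum):
    """Local stand-in; the member values are assumed, not taken from the library."""

    OSS = "OSS"
    S3 = "S3"
    AZURE = "AZURE"


def parse_revision_type(text: str) -> RevisionType:
    """Hypothetical helper: look up a RevisionType member by its name."""
    # Enum classes support lookup by member name with square brackets.
    return RevisionType[text.strip().upper()]
```

Looking a member up by name raises `KeyError` for an unknown name, which keeps invalid revision types from passing silently.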

class graviti.manager.dataset.Dataset(workspace, response)

Bases: graviti.utility.UserMutableMapping[str, graviti.dataframe.DataFrame], graviti.utility.ReprMixin

This class defines the basic concept of the dataset on Graviti.

Parameters
  • workspace (graviti.Workspace) – The Workspace instance that the dataset belongs to.

  • response (Dict[str, Any]) –

    The response of the OpenAPI associated with the dataset:

    {
        "id": <str>,
        "name": <str>,
        "alias": <str>,
        "workspace": <str>,
        "default_branch": <str>,
        "commit_id": <Optional[str]>,
        "cover_url": <str>,
        "creator": <str>,
        "created_at": <str>,
        "updated_at": <str>,
        "is_public": <bool>,
        "storage_config": <str>,
        "backend_type": <str>
    }
    

name

The name of the dataset, unique for a user.

alias

The alias of the dataset.

workspace

The workspace of the dataset.

default_branch

The default branch of the dataset.

commit_id

The commit ID of the dataset.

creator

The creator of the dataset.

created_at

The time when the dataset was created.

updated_at

The time when the dataset was last modified.

is_public

Whether the dataset is public.

storage_config

The storage config of the dataset.

property HEAD(self)

Return the current branch or commit.

Returns

The current branch or commit.

Return type

graviti.manager.commit.Commit

property branches(self)

Get the BranchManager instance of the dataset.

Returns

The BranchManager instance.

Return type

graviti.manager.branch.BranchManager

property drafts(self)

Get the DraftManager instance of the dataset.

Returns

The DraftManager instance.

Return type

graviti.manager.draft.DraftManager

property commits(self)

Get the CommitManager instance of the dataset.

Returns

The CommitManager instance.

Return type

graviti.manager.commit.CommitManager

property tags(self)

Get the TagManager instance of the dataset.

Returns

The TagManager instance.

Return type

graviti.manager.tag.TagManager

property searches(self)

Get the SearchManager instance of the dataset.

Returns

The SearchManager instance.

Return type

graviti.manager.search.SearchManager

property actions(self)

Get the ActionManager instance of the dataset.

Returns

The ActionManager instance.

Return type

graviti.manager.action.ActionManager

checkout(self, revision)

Check out a commit.

Parameters

revision (str) – The information used to locate the specific commit, which can be a commit ID, a branch name, or a tag name.

Return type

None
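To make the three accepted forms of revision concrete, here is a minimal sketch of resolving a revision string to a commit ID. The `resolve_revision` function, its arguments, and the lookup order (branch, then tag, then commit ID) are assumptions for illustration; this page does not describe how the library resolves revisions internally.

```python
from typing import Dict, Set


def resolve_revision(
    revision: str,
    branches: Dict[str, str],
    tags: Dict[str, str],
    commit_ids: Set[str],
) -> str:
    """Hypothetical helper: map a branch name, tag name, or commit ID to a commit ID."""
    # Assumed lookup order: branch names first, then tag names, then raw commit IDs.
    if revision in branches:
        return branches[revision]
    if revision in tags:
        return tags[revision]
    if revision in commit_ids:
        return revision
    raise LookupError(f"unknown revision: {revision!r}")
```

Here `branches` and `tags` map names to commit IDs, and `commit_ids` holds the known commit IDs.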

edit(self, *, name=None, alias=None, default_branch=None)

Update the metadata of the dataset.

Parameters
  • name (Optional[str]) – The new name of the dataset.

  • alias (Optional[str]) – The new alias of the dataset.

  • default_branch (Optional[str]) – The new default branch of the dataset.

Return type

None
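Because every argument is keyword-only and defaults to None, one plausible reading is that omitted fields stay unchanged. The sketch below shows that pattern with a hypothetical `build_edit_payload` helper; the helper and the "None means unchanged" behavior are assumptions, not something this page confirms.

```python
from typing import Any, Dict, Optional


def build_edit_payload(
    *,
    name: Optional[str] = None,
    alias: Optional[str] = None,
    default_branch: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: keep only the fields the caller actually supplied."""
    candidates = {"name": name, "alias": alias, "default_branch": default_branch}
    # Assumption: None means "leave this field unchanged", so it is dropped.
    return {key: value for key, value in candidates.items() if value is not None}
```

Passing only `alias` therefore produces a payload with a single key, and passing nothing produces an empty payload.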

commit(self, title, description=None, jobs=8, quiet=False)

Create, upload and commit the draft to push the local dataset to Graviti.

Parameters
  • title (str) – The commit title.

  • description (Optional[str]) – The commit description.

  • jobs (int) – The maximum number of worker threads used for the multi-threaded upload. Defaults to 8.

  • quiet (bool) – Set to True to hide the upload progress bar.

Raises
  • StatusError – When the HEAD of the dataset is not a branch.

  • StatusError – When the dataset has no modifications.

Return type

None
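The two documented failure cases and the role of `jobs` can be sketched with standard-library pieces. Everything below is a stand-in written for this example: `StatusError` is a local class rather than the library's exception, and `commit_sketch`, `head_is_branch`, `pending`, and `upload` are hypothetical names. It is not the library's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


class StatusError(Exception):
    """Local stand-in for the StatusError named above."""


def commit_sketch(
    head_is_branch: bool,
    pending: List[str],
    upload: Callable[[str], None],
    title: str,
    description: Optional[str] = None,
    jobs: int = 8,
) -> Dict[str, object]:
    """Hypothetical sketch: check both preconditions, then upload in parallel."""
    if not head_is_branch:
        raise StatusError("the HEAD of the dataset is not a branch")
    if not pending:
        raise StatusError("the dataset has no modifications")
    # "jobs" caps the number of worker threads used for the upload.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # list() drains the iterator so that upload errors surface here.
        list(pool.map(upload, pending))
    return {"title": title, "description": description, "uploaded": len(pending)}
```

Checking both preconditions before starting any upload means a failed call leaves nothing half-uploaded in this sketch.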

class graviti.manager.dataset.DatasetManager(workspace)

This class defines the operations on the dataset on Graviti.

Parameters

workspace (graviti.Workspace) – The Workspace instance whose datasets are managed.

create(self, name, alias='', storage_config=None)

Create a Graviti dataset with the given name.

Parameters
  • name (str) – The name of the dataset, unique for a user.

  • alias (str) – The alias of the dataset. Defaults to "" (an empty string).

  • storage_config (Optional[str]) – The auth storage config name.

Returns

The created Dataset instance.

Return type

Dataset

get(self, name)

Get a Graviti dataset with the given name.

Parameters

name (str) – The name of the dataset, unique for a user.

Returns

The requested Dataset instance.

Raises

ResourceNameError – When the requested dataset does not exist.

Return type

Dataset

list(self)

List Graviti datasets.

Returns

A LazyPagingList of Dataset instances.

Return type

graviti.manager.lazy.LazyPagingList[Dataset]
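The name LazyPagingList suggests that pages are fetched only when they are needed. The generator below sketches that general idea; it is an assumption about the concept, not a description of how LazyPagingList is implemented, and `iter_pages` and `fetch_page` are hypothetical names.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(fetch_page: Callable[[int, int], List[T]], limit: int = 2) -> Iterator[T]:
    """Hypothetical sketch: yield items page by page, fetching each page on demand."""
    offset = 0
    while True:
        # The next page is requested only once the previous one is exhausted.
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            # A short page means there is nothing left to fetch.
            return
        offset += limit
```

With three items and a page size of two, reading the first item triggers one fetch, and the second fetch happens only when the third item is requested.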

delete(self, name)

Delete a Graviti dataset with the given name.

Parameters

name (str) – The name of the dataset, unique for a user.

Return type

None
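The four manager operations follow a familiar create, get, list, delete shape. The in-memory stand-in below mirrors that shape so the flow can be tried without a Graviti account. `InMemoryDatasetManager` and the local `ResourceNameError` are written for this example, and the behavior on a duplicate name or on deleting a missing dataset is an assumption that this page does not specify.

```python
from typing import Dict, List


class ResourceNameError(Exception):
    """Local stand-in for the ResourceNameError named above."""


class InMemoryDatasetManager:
    """Hypothetical stand-in that keeps datasets in a dictionary."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Dict[str, str]] = {}

    def create(self, name: str, alias: str = "") -> Dict[str, str]:
        # Assumption: a duplicate name is rejected, since names are unique per user.
        if name in self._datasets:
            raise ValueError(f"dataset already exists: {name}")
        self._datasets[name] = {"name": name, "alias": alias}
        return self._datasets[name]

    def get(self, name: str) -> Dict[str, str]:
        try:
            return self._datasets[name]
        except KeyError:
            raise ResourceNameError(name) from None

    def list(self) -> List[Dict[str, str]]:
        # Inside a method body, list() still refers to the built-in.
        return list(self._datasets.values())

    def delete(self, name: str) -> None:
        # Assumption: deleting a missing dataset raises the same error as get().
        self.get(name)
        del self._datasets[name]
```

A typical round trip is to create a dataset, fetch it by name, list everything, and delete it again; a later `get` on the deleted name then raises the stand-in `ResourceNameError`.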