Getting Started with Data Package Manager (DPM)

To get started with the Data Package Manager, you can install it using:

pip install dpm

DPM simplifies installing (downloading) the data resources described in Frictionless data package descriptors and provides tools to manage them.

If you have a valid data package descriptor, such as a datapackage.json file, you can use the command dpm install to download its data sources in a structured way to your local machine.

Step 1: Create a Configuration File

Begin by creating a file named data.toml in the root of your project, containing the metadata for the data packages you want to use:

# file: data.toml
[packages]

[packages.your_datapackage_name]
path = "https://raw.githubusercontent.com/your-org/your_repo/datapackage.json"
token = "your_github_pat_if_needed_to_access_private_repositories"

Step 2: Install the Data Packages

Next, run the following command:

dpm install

This command fetches the datapackage.json at the URL specified in your data.toml file. The resources described in that descriptor are then downloaded and saved in the datapackages folder by default.
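To make the relationship between the descriptor URL and its resources concrete, here is a conceptual sketch of how resource locations can be derived from a descriptor. It is not DPM's actual implementation. It assumes the Frictionless convention that a relative resource path is resolved against the location of the descriptor; the function name resource_urls and the sample descriptor are hypothetical.

```python
from urllib.parse import urljoin


def resource_urls(descriptor: dict, descriptor_url: str) -> dict[str, str]:
    """Map each resource name to the URL its data would be fetched from.

    Conceptual illustration only; DPM's internals may differ.
    """
    urls = {}
    for resource in descriptor.get("resources", []):
        path = resource.get("path")
        # Skip resources with inline data or multi-part (list) paths.
        if isinstance(path, str):
            # Relative paths resolve against the descriptor's own URL;
            # absolute URLs are returned unchanged by urljoin.
            urls[resource["name"]] = urljoin(descriptor_url, path)
    return urls


# Hypothetical descriptor, for illustration only.
SAMPLE_DESCRIPTOR = {
    "name": "your_datapackage_name",
    "resources": [
        {"name": "resource1", "path": "data/resource1.csv"},
        {"name": "resource2", "path": "https://example.com/resource2.xlsx"},
    ],
}
SAMPLE_URL = "https://raw.githubusercontent.com/your-org/your_repo/datapackage.json"

if __name__ == "__main__":
    for name, url in resource_urls(SAMPLE_DESCRIPTOR, SAMPLE_URL).items():
        print(f"{name}: {url}")
```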

For each data package, a subfolder named after it (your_datapackage_name in this example) is created to hold its resources, and the data package descriptor is also downloaded:

.
└── your_project_name/
    ├── datapackages/
    │   ├── your_datapackage_name/
    │   │   ├── resource1.csv
    │   │   └── resource2.xlsx
    │   └── datapackage.json
    ├── README.md
    ├── data.toml
    └── main.py
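After installing, you may want to confirm from main.py that the expected files are in place. The sketch below walks the default datapackages folder and lists the files found in each package subfolder. It assumes the layout shown in the tree above; the helper name downloaded_files is hypothetical and is not part of DPM.

```python
from pathlib import Path


def downloaded_files(root: Path) -> dict[str, list[str]]:
    """Map each package subfolder under `root` to its sorted file names.

    Hypothetical helper for illustration; not part of DPM.
    """
    result = {}
    for package_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        result[package_dir.name] = sorted(
            f.name for f in package_dir.iterdir() if f.is_file()
        )
    return result


if __name__ == "__main__":
    root = Path("datapackages")  # default output folder described above
    if root.is_dir():
        for package, files in downloaded_files(root).items():
            print(package, files)
    else:
        print("No datapackages folder found; run dpm install first.")
```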