Getting Started
Getting Started with Data Package Manager (DPM)
To get started with the Data Package Manager, you can install it using:
pip install dpm
DPM simplifies the process of installing (downloading) data resources described in Frictionless data package descriptors and provides a set of functionalities to manage them.
If you have a valid data package descriptor, such as a datapackage.json
file, you can use the command dpm install
to download its data sources in a structured way to your local machine.
Step 1: Create a Configuration File
Begin by creating a file named data.toml
in the root of your project, containing the metadata for the data packages you want to use:
# file: data.toml
[packages]
[packages.your_datapackage_name]
path = "https://raw.githubusercontent.com/your-org/your_repo/datapackage.json"
token = "your_github_pat_if_needed_to_access_private_repositories"
Step 2: Install the Data Packages
Next, run the following command:
dpm install
This command will access the datapackage.json
located at the specified URL in your data.toml
file. The resources described in the datapackage.json
will be downloaded and saved in the datapackages
folder by default.
For each resource, a subfolder named your_datapackage_name
will be created, and the data package descriptor will also be downloaded:
.
└── your_project_name/
├── datapackages/
│ ├── your_datapackage_name/
│ │ ├── resource1.csv
│ │ └── resource2.xlsx
│ └── datapackage.json
├── README.md
├── data.toml
└── main.py