This repository includes the code to download the curated Hugging Face daily papers into a single Markdown-formatted file.

elsatch/daily_hf_papers_abstracts

HuggingFace Daily Papers Abstracts Extractor

This project automates the process of downloading, summarizing, and converting daily papers from Hugging Face into easily readable formats.

Sample output of abstract extraction process

Features

  • Download daily papers from Hugging Face API
  • Extract abstracts and generate markdown summaries
  • Handle empty files and weekends/holidays
  • Avoid reprocessing existing files

Project Structure

hf_daily_papers/
│
├── data/
│   ├── input/  # Downloaded JSON files
│   ├── output/ # Generated markdown files
│
├── src/
│   ├── download_daily_papers.py
│   ├── daily_papers_abstract_extractor.py
│
└── README.md

Installation

  1. Clone this repository:

    git clone https://github.com/elsatch/daily_hf_papers_abstracts.git
    cd daily_hf_papers_abstracts
    
  2. Install the required dependencies:

    pip install requests
    

Usage

  1. Download daily papers:

    python src/download_daily_papers.py [YYYYMMDD]
    

    If no date is provided, the script downloads papers for the current date.

  2. Process JSON files and generate markdown summaries:

    python src/daily_papers_abstract_extractor.py
    
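The optional `[YYYYMMDD]` argument from step 1 could be handled roughly as follows. This is a minimal sketch, not the code in `src/download_daily_papers.py`: the function names `resolve_date` and `input_path_for` and the one-file-per-day naming scheme are assumptions made for illustration; only the `data/input/` location comes from the project structure above.

```python
import sys
from datetime import date, datetime
from pathlib import Path

# Matches the data/input/ folder listed under Project Structure.
INPUT_DIR = Path("data/input")


def resolve_date(args):
    """Return the target day as a YYYYMMDD string.

    Falls back to today's date when no argument is given.
    """
    if not args:
        return date.today().strftime("%Y%m%d")
    # strptime raises ValueError if the argument cannot be parsed as a date.
    return datetime.strptime(args[0], "%Y%m%d").strftime("%Y%m%d")


def input_path_for(day):
    """Hypothetical naming scheme: one JSON file per day."""
    return INPUT_DIR / f"{day}.json"


if __name__ == "__main__":
    day = resolve_date(sys.argv[1:])
    print(f"Would save papers for {day} to {input_path_for(day)}")
```

Validating the date before any network request means a mistyped argument fails immediately instead of producing an oddly named file.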

Notes

  • The scripts handle empty files that may occur during weekends or holidays.
  • Existing processed files are not overwritten to avoid unnecessary reprocessing.
  • You can run these scripts daily to keep up with the latest papers.
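The first two notes (skipping empty files and not overwriting existing output) can be illustrated with a short sketch. It is not the actual `daily_papers_abstract_extractor.py`: the function name `extract_abstracts` is hypothetical, and the JSON shape (a list of entries, each with a `paper` object holding `title` and `summary`) is an assumption about the downloaded files.

```python
import json
from pathlib import Path


def extract_abstracts(json_path, output_dir):
    """Write a markdown summary for one downloaded JSON file.

    Returns the output path, or None when the file is skipped.
    """
    json_path, output_dir = Path(json_path), Path(output_dir)
    out_path = output_dir / f"{json_path.stem}.md"

    # Do not reprocess a day that already has a markdown file.
    if out_path.exists():
        return None

    # Weekend or holiday downloads may be empty; skip them.
    raw = json_path.read_text(encoding="utf-8").strip()
    if not raw:
        return None
    entries = json.loads(raw)
    if not entries:
        return None

    lines = [f"# Daily papers {json_path.stem}", ""]
    for entry in entries:
        paper = entry.get("paper", {})  # assumed shape, see lead-in
        title = paper.get("title", "Untitled")
        summary = paper.get("summary", "").strip()
        lines += [f"## {title}", "", summary, ""]

    output_dir.mkdir(parents=True, exist_ok=True)
    out_path.write_text("\n".join(lines), encoding="utf-8")
    return out_path
```

Calling this for every file in `data/input/` with `data/output/` as the target makes a daily run safe to repeat, since previously processed days return `None` without touching the existing markdown.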

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.
