Code changes:
- Remove references to deprecated NumPy types (#69 - thanks @BlackSpade741!)
- Switch from cChardet to charset-normalizer for Python 3.10 support (#76 - thanks @brockhaywood!)
Other changes:
- Miscellaneous improvements to tests, code formatting, and documentation (#61 - thanks @invisiblefunnel!)
- Relocate usage examples from wiki to README (#70 - thanks @landonreed!)
- README tweaks (#74 - thanks @chelsey!)
- Use GitHub Actions for automated testing (#79 - thanks @dget!). Note: we now test against Python versions 3.8, 3.9, 3.10, and 3.11.
- Improve file encoding sniffer, which was misidentifying some Finnish and emoji Unicode text. Thanks to @dyakovlev!
- Add `partridge.load_geo_feed` for reading stops and shapes into GeoPandas GeoDataFrames.
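As a rough sketch of how the geo loader might be used, assuming GeoPandas is installed and `path` points at a GTFS zip or folder (the helper name `load_stops_and_shapes` is hypothetical and not part of partridge):

```python
def load_stops_and_shapes(path):
    """Return (stops, shapes) GeoDataFrames for the GTFS feed at `path`."""
    # Imported lazily so the helper can be defined without partridge installed.
    import partridge as ptg

    feed = ptg.load_geo_feed(path)
    # Both tables are expected to carry a geometry column.
    return feed.stops, feed.shapes
```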
This release is a combination of major internal refactorings and some minor interface changes. Overall, you should expect your upgrade from pre-1.0 versions to be relatively painless. A big thank you to @genhernandez and @csb19815 for their valuable design feedback. If you still need Python 2 support, please continue using version 0.11.0.
Here is a list of interface changes:
- The class `partridge.gtfs.feed` has been renamed to `partridge.gtfs.Feed`.
- The public interface for instantiating feeds is `partridge.load_feed`. This function replaces the previously undocumented function `partridge.get_filtered_feed`.
- A new function has been added for identifying the busiest week in a feed: `partridge.read_busiest_date`.
- The public function `partridge.get_representative_feed` has been removed in favor of using `partridge.read_busiest_date` directly.
- The public function `partridge.writers.extract_feed` is now available via the top level module: `partridge.extract_feed`.
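One way to replace a call to the removed `partridge.get_representative_feed` is to combine `partridge.read_busiest_date` with `partridge.load_feed`. The following is a minimal sketch under that assumption; the helper name `load_busiest_day` is hypothetical:

```python
def load_busiest_day(path):
    """Load a feed filtered down to the service running on its busiest date."""
    # Imported lazily so the helper can be defined without partridge installed.
    import partridge as ptg

    # read_busiest_date returns the date and the service_ids active on it.
    date, service_ids = ptg.read_busiest_date(path)

    # A view maps file names to column filters; here only trips whose
    # service_id is active on the busiest date are kept.
    view = {"trips.txt": {"service_id": service_ids}}
    feed = ptg.load_feed(path, view)
    return date, feed
```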
Miscellaneous minor changes:
- Character encoding detection is now done by the `cchardet` package instead of `chardet`. `cchardet` is faster, but may not always return the same result as `chardet`.
- Zip files are unpacked into a temporary directory instead of reading directly from the zip. These temporary directories are cleaned up when the feed is garbage collected or when the process exits.
- The code base is now annotated with type hints and the build runs `mypy` to verify the types.
- DataFrames are cached in a dictionary instead of the `functools.lru_cache` decorator.
- The `partridge.extract_feed` function now writes files concurrently to improve performance.
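The temporary-directory behavior described above can be approximated with the standard library alone. This is an illustrative sketch, not partridge's actual implementation; the class name `UnpackedZip` is hypothetical:

```python
import shutil
import tempfile
import weakref
import zipfile


class UnpackedZip:
    """Hypothetical holder for a zip extracted into a temporary directory."""

    def __init__(self, zip_path):
        self.path = tempfile.mkdtemp()
        # The finalizer runs when this object is garbage collected, or at
        # interpreter exit if the object is still alive at that point.
        self._finalizer = weakref.finalize(
            self, shutil.rmtree, self.path, ignore_errors=True
        )
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(self.path)
```

The finalizer is given only the directory path, not the object itself, so it does not keep the object alive.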
- Fix major performance issue related to encoding detection. Thank you to @cjer for reporting the issue and advising on a solution.
- Improved handling of non-standards-compliant file encodings.
- Only require functools32 for Python < 3.
- `ptg.parsers.parse_date` no longer accepts dates, only strings.
- Improves read time for large feeds by adding LRU caching to `ptg.parsers.parse_time`.
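To illustrate why LRU caching helps here: large feeds tend to repeat the same time strings many times, so each distinct string only needs to be parsed once. A minimal sketch, assuming times are `HH:MM:SS` strings converted to seconds (the function `parse_time_cached` is hypothetical and not partridge's own parser):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_time_cached(value):
    """Convert a GTFS 'HH:MM:SS' string to seconds since the start of the service day."""
    hours, minutes, seconds = value.split(":")
    # GTFS allows hours of 24 or more for trips that run past midnight.
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)
```

Repeated calls with the same string are served from the cache, which `parse_time_cached.cache_info()` can confirm.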
- Gracefully handle completely empty files. This change unifies the behavior of reading from a CSV with a header only (no data rows) and a completely empty (zero bytes) file in the zip.
- Fix handling of nested folders and zip containing nested folders.
- Add `ptg.get_filtered_feed` for multi-file filtering.
- Fix bug in `ptg.read_service_ids_by_date`. Reported by @cjer in #27.
- To reduce its size, the published package no longer includes unnecessary fixtures.
- Naively write a feed object to a zip file with `ptg.write_feed_dangerously`.
- Read the earliest, busiest date and its `service_id`s from a feed with `ptg.read_busiest_date`.
- Bug fix: Handle `calendar.txt`/`calendar_dates.txt` entries without applicable trips.
- Add support for reading files from a folder. Thanks again @danielsclint!
- Easily build a representative view of a zip with `ptg.get_representative_feed`. Inspired by peartree.
- Extract out GTFS zips by agency_id/route_id with `ptg.extract_{agencies,routes}`.
- Read arbitrary files from a zip with `feed.get('myfile.txt')`.
- Remove `service_ids_by_date`, `dates_by_service_ids`, and `trip_counts_by_date` from the feed class. Instead use `ptg.{read_service_ids_by_date,read_dates_by_service_ids,read_trip_counts_by_date}`.
- Add support for Python 2.7. Thanks @danielsclint!
- Fix service date resolution for raw_feed. Previously raw_feed considered all days of the week from calendar.txt to be active regardless of 0/1 value.
- Add missing edge from fare_rules.txt to routes.txt in default dependency graph.
- First release on PyPI.
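For the service date fix noted above, the intended rule is that a `calendar.txt` row is active only on weekdays whose column is `1`. The sketch below illustrates that rule with the standard library; it is not partridge's implementation, `active_dates` is a hypothetical helper, and the row is assumed to be a dict of strings as read from CSV:

```python
import datetime

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def active_dates(row):
    """Return the dates on which one calendar.txt row is in service."""
    start = datetime.datetime.strptime(row["start_date"], "%Y%m%d").date()
    end = datetime.datetime.strptime(row["end_date"], "%Y%m%d").date()
    dates = []
    day = start
    while day <= end:
        # date.weekday() is 0 for Monday, matching the order of WEEKDAYS.
        if row[WEEKDAYS[day.weekday()]] == "1":
            dates.append(day)
        day += datetime.timedelta(days=1)
    return dates
```

Treating every weekday as active regardless of its flag, as the old behavior did, would return every date in the range instead.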