With many web scraping libraries available, each with its own interface and conventions, soupsavvy provides a consistent and easy way to build selection workflows.
With soupsavvy, developers can focus on data extraction workflows instead of wrestling with library-specific quirks and inconsistencies. Eliminate complexity and introduce scalability and maintainability to your web scraping projects.
soupsavvy introduces the concept of Selector, a declarative, consistent, and reusable search procedure that is based on the following principles:
- Decoupling: Selection logic is abstracted away from DOM node and traversal implementations.
- Framework-Agnostic: Operates consistently with any supported library.
- Flexible & Extensible: Lightweight, reusable components that can be combined to build complex selection workflows.
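
The principles above can be illustrated with a small standard-library sketch. This is not soupsavvy's API: the names Node, EtreeNode, Selector, TagIs, AttrIs, and AllOf are hypothetical and exist only to show how selection logic can be kept separate from the node implementation and composed from small parts. See the documentation for the actual classes the package provides.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterator, Protocol


class Node(Protocol):
    """Hypothetical minimal node interface; selectors depend only on this."""

    def tag(self) -> str: ...
    def attr(self, name: str) -> str | None: ...
    def descendants(self) -> Iterator[Node]: ...


@dataclass(frozen=True)
class EtreeNode:
    """Adapter exposing an ElementTree element through the Node interface."""

    element: ET.Element

    def tag(self) -> str:
        return self.element.tag

    def attr(self, name: str) -> str | None:
        return self.element.get(name)

    def descendants(self) -> Iterator[EtreeNode]:
        nodes = self.element.iter()
        next(nodes)  # Element.iter() yields the element itself first; skip it
        return (EtreeNode(child) for child in nodes)


class Selector:
    """Declarative search procedure: subclasses only define `matches`."""

    def matches(self, node: Node) -> bool:
        raise NotImplementedError

    def find_all(self, root: Node) -> list[Node]:
        # Traversal is delegated to the node adapter, not to the selector.
        return [node for node in root.descendants() if self.matches(node)]

    def __and__(self, other: Selector) -> Selector:
        # Combine two selectors into one that requires both to match.
        return AllOf(self, other)


@dataclass(frozen=True)
class TagIs(Selector):
    name: str

    def matches(self, node: Node) -> bool:
        return node.tag() == self.name


@dataclass(frozen=True)
class AttrIs(Selector):
    name: str
    value: str

    def matches(self, node: Node) -> bool:
        return node.attr(self.name) == self.value


class AllOf(Selector):
    def __init__(self, *selectors: Selector) -> None:
        self.selectors = selectors

    def matches(self, node: Node) -> bool:
        return all(selector.matches(node) for selector in self.selectors)


doc = ET.fromstring(
    "<div><p class='price'>10</p><span class='price'>x</span><p>20</p></div>"
)

# The selector is defined once, independently of any document or parser.
price_paragraph = TagIs("p") & AttrIs("class", "price")
matches = price_paragraph.find_all(EtreeNode(doc))
texts = [match.element.text for match in matches]
```

In this sketch, supporting another parsing library would only require a new adapter that satisfies Node; the selector definitions would stay unchanged.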
soupsavvy is published on PyPI and can be installed via pip:
pip install soupsavvy
Full documentation is available on the project's documentation site.
For more information about the package, its concepts, and usage, read the Demos section of the documentation. It is a step-by-step guide to the most important features of the package.
If you'd like to contribute to soupsavvy, feel free to check out the GitHub repository and submit pull requests to one of the development branches. Any feedback, bug reports, or feature requests are welcome! If in doubt, follow the Contribution Guidelines.
soupsavvy is released under a license that allows both personal and commercial use. See the LICENSE file for more information.
Happy scraping! ✨