GitHub - sewcio543/soupsavvy: Powerful and flexible web scraping Search Engine

Powerful and flexible web scraping Search Engine

About

With many web scraping libraries available, each with its own interface and conventions, soupsavvy provides a consistent and easy way of building selection workflows.

With soupsavvy, developers can focus on data extraction workflows instead of wrestling with library-specific quirks and inconsistencies. Eliminate complexity and introduce scalability and maintainability to your web scraping projects.

Key Features

soupsavvy introduces the concept of a Selector: a declarative, consistent, and reusable search procedure based on the following principles:

Decoupling: Selection logic is abstracted away from DOM node and traversal implementations.
Framework-Agnostic: Operates consistently with any supported library.
Flexible & Extensible: Lightweight, reusable components that can be combined to build complex selection workflows.
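
The principles above can be illustrated with a small, self-contained sketch. It is not soupsavvy's actual API: every name in it (`Node`, `ETreeNode`, `Selector`, `TagSelector`, `ClassNameSelector`, `AndSelector`) is hypothetical and exists only to show the idea, and the standard library's `xml.etree.ElementTree` stands in for a scraping backend. Refer to the documentation for the real classes and functions.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterator, Protocol


class Node(Protocol):
    """Hypothetical minimal node interface; selectors depend only on this."""

    @property
    def name(self) -> str: ...
    def attr(self, key: str) -> str | None: ...
    def descendants(self) -> Iterator[Node]: ...


@dataclass
class ETreeNode:
    """Hypothetical adapter wrapping one backend (ElementTree) behind Node."""

    raw: ET.Element

    @property
    def name(self) -> str:
        return self.raw.tag

    def attr(self, key: str) -> str | None:
        return self.raw.get(key)

    def descendants(self) -> Iterator[Node]:
        # Element.iter() yields the element itself first, so skip it.
        for child in self.raw.iter():
            if child is not self.raw:
                yield ETreeNode(child)


class Selector:
    """Declarative search procedure, independent of any backend."""

    def matches(self, node: Node) -> bool:
        raise NotImplementedError

    def find_all(self, node: Node) -> list[Node]:
        return [n for n in node.descendants() if self.matches(n)]

    def __and__(self, other: Selector) -> Selector:
        # Compose two selectors into one that requires both to match.
        return AndSelector(self, other)


@dataclass(frozen=True)
class TagSelector(Selector):
    tag: str

    def matches(self, node: Node) -> bool:
        return node.name == self.tag


@dataclass(frozen=True)
class ClassNameSelector(Selector):
    value: str

    def matches(self, node: Node) -> bool:
        # The class attribute is a space-separated list of class names.
        return self.value in (node.attr("class") or "").split()


@dataclass(frozen=True)
class AndSelector(Selector):
    left: Selector
    right: Selector

    def matches(self, node: Node) -> bool:
        return self.left.matches(node) and self.right.matches(node)


markup = "<div><p class='price'>10</p><p>note</p><span class='price big'>20</span></div>"
root = ETreeNode(ET.fromstring(markup))

# Reusable selectors, combined without touching traversal code.
any_price = ClassNameSelector("price")
price_paragraph = TagSelector("p") & any_price

all_prices = [n.raw.text for n in any_price.find_all(root)]
paragraph_prices = [n.raw.text for n in price_paragraph.find_all(root)]
print(all_prices)        # ['10', '20']
print(paragraph_prices)  # ['10']
```

In this sketch, supporting another parsing library would mean writing one more adapter like `ETreeNode`; the selectors themselves would not change, which is the decoupling the list above describes.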

Installation

soupsavvy is published on PyPI and can be installed via pip:

pip install soupsavvy

Documentation

Full documentation is available on the project's documentation site.

Demos

For more information about the package, its concepts, and its usage, read the Demos section of the documentation. It is a step-by-step guide to the most important features of the package.

Contributing

If you'd like to contribute to soupsavvy, feel free to check out the GitHub repository and submit pull requests to one of the development branches. Any feedback, bug reports, or feature requests are welcome! If in doubt, follow the Contribution Guidelines.

License

soupsavvy is licensed under , allowing for both personal and commercial use. See the LICENSE file for more information.

Happy scraping! ✨
