
v1.4.4

@nickscamara nickscamara released this 14 Feb 16:03

🚀 Features & Enhancements

  • Scrape API: Added action & wait time validation (#1146)
  • Extraction Improvements:
    • Added detection of PDF/image sub-links & extracted text via Gemini (#1173)
    • Multi-entity prompt enhancements for extraction (#1181)
    • Moved source display out of __experimental in extraction (#1180)
  • Environment Setup: Added Serper & Search API env vars to docker-compose (#1147)
  • Credit System Update: Now displays "tokens" instead of "credits" when out of tokens (#1178)
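As a rough illustration of the action and wait time validation added in #1146, the sketch below checks a scrape request before it is queued. The limits `MAX_ACTIONS` and `MAX_TOTAL_WAIT_MS`, the function name `validate_scrape_request`, and the request field names are assumptions made for this example; the release notes do not state the actual rules or values.

```python
from typing import Any

# Hypothetical limits, chosen for illustration only; the release notes do not
# state the values the Scrape API actually enforces.
MAX_ACTIONS = 50
MAX_TOTAL_WAIT_MS = 60_000


def validate_scrape_request(request: dict[str, Any]) -> list[str]:
    """Return a list of validation errors (empty if the request passes)."""
    errors: list[str] = []
    actions = request.get("actions", [])  # assumed field name
    wait_for = request.get("waitFor", 0)  # assumed field name, in milliseconds

    if len(actions) > MAX_ACTIONS:
        errors.append(f"too many actions: {len(actions)} > {MAX_ACTIONS}")

    # Add the page-level wait to every explicit "wait" action.
    total_wait = wait_for + sum(
        a.get("milliseconds", 0) for a in actions if a.get("type") == "wait"
    )
    if total_wait > MAX_TOTAL_WAIT_MS:
        errors.append(
            f"total wait time {total_wait} ms exceeds {MAX_TOTAL_WAIT_MS} ms"
        )

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with the request in a single response.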

✏️ Examples

  • Gemini 2.0 Crawler: Implemented Gemini 2.0 crawler (#1161)

🐛 Fixes

  • HTML Transformer: Updated free_string function parameter type (#1163)
  • Gemini Crawler: Updated library & improved PDF link extraction (#1175)
  • Crawl Queue Worker: Only reports successful page count in num_docs (#1179)
  • Scraping & URLs:
    • Fixed relative-to-full URL conversion, which used the wrong base URL (#584)
    • Enforced scrape rate limit in batch scraping (#1182)
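To show why the base URL matters in the relative URL fix (#584), here is a minimal sketch using Python's standard `urllib.parse.urljoin`. It is not the project's implementation; the helper name `to_absolute` and the example URLs are made up for illustration.

```python
from urllib.parse import urljoin


def to_absolute(page_url: str, href: str) -> str:
    """Resolve href against the URL of the page it was found on."""
    return urljoin(page_url, href)


page = "https://example.com/docs/guide/intro.html"

print(to_absolute(page, "setup.html"))         # https://example.com/docs/guide/setup.html
print(to_absolute(page, "../api/index.html"))  # https://example.com/docs/api/index.html
print(to_absolute(page, "/pricing"))           # https://example.com/pricing
print(to_absolute(page, "https://other.example/x"))  # already absolute, returned unchanged
```

Resolving `setup.html` against the site root instead of the page URL would yield `https://example.com/setup.html`, which is a different page. That is the kind of error a wrong base URL produces.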

What's Changed

  • [FIR-796] feat(api/types): Add action and wait time validation for scrape requests by @ftonato in #1146
  • Implemented Gemini 2.0 crawler by @aparupganguly in #1161
  • Add Serper and Search API env vars to docker-compose by @RealLukeMartin in #1147
  • fix(html-transformer): Update free_string function parameter type by @carterlasalle in #1163
  • Add detection of PDF/image sub-links and extract text via Gemini by @mayooear in #1173
  • fix: update gemini library. extract pdf links from scraped content by @mayooear in #1175
  • feat(v1/checkCredits): say "tokens" instead of "credits" if out of tokens by @mogery in #1178
  • feat(v1/extract) Show sources out of __experimental by @nickscamara in #1180
  • (feat/extract) Multi-entity prompt improvements by @nickscamara in #1181
  • fix(queue-worker/crawl): only report successful page count in num_docs (FIR-960) by @mogery in #1179
  • fix: relative url 2 full url use error base url by @dolonfly in #584
  • fix(v1/batch/scrape): use scrape rate limit by @mogery in #1182

New Contributors

Full Changelog: v1.4.3...v1.4.4