This project is mirrored from https://github.com/Mintplex-Labs/anything-llm.
- Dec 30, 2024
-
-
Sean Hatfield authored
* add audio file validations
* patch sharp to support wavfile parsing

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
-
- Dec 11, 2024
-
-
Sean Hatfield authored
* fix scraping failed bug in link/bulk link scrapers
* reset submodule
* swap to networkidle2 as a safe mix for SPA and API-loaded pages, but also not hang on request heavy pages
* lint

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
-
- Nov 20, 2024
-
-
Timothy Carambat authored
-
- Nov 12, 2024
-
-
Sean Hatfield authored
remove openai whisper transcription provider response_format option
-
- Oct 31, 2024
-
-
Sean Hatfield authored
* allow 127.0.0.1 as valid url for scraping
* update comments and lint

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
-
- Oct 28, 2024
-
-
timothycarambat authored
-
- Oct 21, 2024
-
-
Sean Hatfield authored
* fix tree/blob github urls from branches not being loaded
* improve ux of github data connector
* lint
* patch Github URL parser to just validate with `URL` native parser
* uncheck LocalStorage of PAT for security reasons

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
-
- Oct 18, 2024
-
-
timothycarambat authored
-
Sean Hatfield authored
handle non-ascii characters in urls
-
- Oct 03, 2024
-
-
Sean Hatfield authored
* support xlsx files
* lint
* create separate docs for each xlsx sheet
* lint
* use node-xlsx pkg for parsing xlsx files
* lint
* update error handling

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
-
- Oct 02, 2024
-
-
Timothy Carambat authored
-
- Sep 30, 2024
-
-
Blazej Owczarczyk authored
-
- Sep 26, 2024
-
-
Timothy Carambat authored
* Updated the `GitHubRepoLoader` class to use the new import syntax and adjust the `recursiveLoader` method accordingly.
* add @langchain/community to collector package.json
* fix: Improve handling of complex ignore patterns in GitLabRepoLoader
* refactor: use ignore package for simplified ignore logic
* run yarn lint
* add @langchain/community@^0.2.23
* remove unused dep lint

---------

Co-authored-by: Emil Rofors (aider) <emirof@gmail.com>
-
Blazej Owczarczyk authored
* Added an option to fetch issues from gitlab. Made the file fetching asynchronous to improve performance. #2334
* Fixed a typo in loadGitlabRepo.
* Convert issues to markdown.
* Fixed an issue with time estimate field names in issueToMarkdown.
* handle rate limits more gracefully + update checkbox to toggle switch
* lint

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
-
Timothy Carambat authored
-
- Sep 25, 2024
-
-
Sean Hatfield authored
* support more confluence url formats
* use pattern matching for confluence urls and manual splitting as fallback
* rework entire Confluence flow to prevent issues with custom, local, and cloud spaces
* remove dep

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
-
- Sep 19, 2024
-
-
Sean Hatfield authored
confluence custom domain fix
-
Timothy Carambat authored
* Fix gitlab data connector for self-hosted instances (#2315)
* Linting fix.
* Load all branches in the GitLab data connector #2319
* #2319 lint fixes.
* update fetch on fail

---------

Co-authored-by: Błażej Owczarczyk <blazeyy@gmail.com>
-
- Sep 18, 2024
-
-
Blazej Owczarczyk authored
* Fix gitlab data connector for self-hosted instances (#2315)
* Linting fix.
-
- Sep 09, 2024
-
-
timothycarambat authored
connect #2243
-
- Sep 06, 2024
-
-
timothycarambat authored
-
- Aug 14, 2024
-
-
timothycarambat authored
resolves #2114
-
timothycarambat authored
-
- Aug 12, 2024
-
-
Sean Hatfield authored
fix depth handling in bulk link scraper
-
- Aug 10, 2024
-
-
Lea Anthony authored
Support Go filetype
-
- Aug 06, 2024
-
-
Mehmet Ünlü authored
fix: remove unnecessary break

Remove unnecessary break that prevents checking next pages for blob objects.
-
Sean Hatfield authored
youtube loader whitespace fix
-
- Jul 25, 2024
-
-
Timothy Carambat authored
* Remove unused deps
* improve dependency
-
- Jul 23, 2024
-
-
Timothy Carambat authored
* Add support for GitLab repo collection as well as Github Repo collection
* Refactor for repo collectors to be more compact

---------

Co-authored-by: Emil Rofors <emirof@gmail.com>
-
- Jul 20, 2024
-
-
timothycarambat authored
-
timothycarambat authored
-
- Jul 16, 2024
-
-
Sean Hatfield authored
use pdf.js by importing it from pdf-parse and fix custom PDFLoader module
-
- Jul 11, 2024
-
-
Sean Hatfield authored
* implement custom PDFLoader to remove LC dep
* remove unneeded comment
* remove pdfjs as dep and fix page splitting using pdf-parse
* linting + export rename for desktop compat

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
-
- Jul 04, 2024
-
-
timothycarambat authored
-
- Jul 03, 2024
-
-
Timothy Carambat authored
-
Sean Hatfield authored
* WIP replace langchain pdfloader with pdfjs and add more context to each page
* remove extras from pdfjs and just replace langchain library
* remove unneeded dep
* fix console log in docs

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
-
Sean Hatfield authored
implement custom confluence loader to extract code blocks properly from documents

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
-
- Jul 01, 2024
-
-
Sean Hatfield authored
patch website depth data connector to work for other links that are not root url
-
- Jun 25, 2024
-
-
Jason Zhang authored
* fix: sanitize filename before writing
  Fixes: https://github.com/Mintplex-Labs/anything-llm/issues/1737
* fixup
* fixup
-
- Jun 21, 2024
-
-
Timothy Carambat authored
* wip bg workers for live document sync
* Add ability to re-embed specific documents across many workspaces via background queue
  bgworker is gated behind experimental system setting flag that needs to be explicitly enabled
  UI for watching/unwatching documents that are embedded.
  TODO: UI to easily manage all bg tasks and see run results
  TODO: UI to enable this feature and background endpoints to manage it
* create frontend views and paths
  Move elements to correct experimental scope
* update migration to delete runs on removal of watched document
* Add watch support to YouTube transcripts (#1716)
* Add watch support to YouTube transcripts
  refactor how sync is done for supported types
* Watch specific files in Confluence space (#1718)
  Add failure-prune check for runs
* create tmp workflow modifications for beta image
* create tmp workflow modifications for beta image
* create tmp workflow modifications for beta image
* dual build update copy of alert modals
* update job interval
* Add support for live-sync of Github files
* update copy for document sync feature
* hide Experimental features from UI
* update docs links
* [FEAT] Implement new settings menu for experimental features (#1735)
* implement new settings menu for experimental features
* remove unused context save bar

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* don't run job on boot
* unset workflow changes
* Add persistent encryption service
  Relay key to collector so persistent encryption can be used
  Encrypt any private data in chunkSources used for replay during resync jobs
* update JSDoc
* Linting and organization
* update modal copy for feature

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
-