Add tokenizer improvements via Singleton class and estimation (#3072)
* Add tokenizer improvements via Singleton class linting
* dev build
* Estimation fallback when string exceeds a fixed byte size
* Add notice to tiktoken on backend
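The two ideas named in the commit message (a single shared tokenizer instance, and an estimate in place of full encoding for oversized strings) can be illustrated with a minimal sketch. This is not the code from this commit: `TokenCounter`, `MAX_BYTES`, the injected `encode` function, and the four-characters-per-token ratio are all hypothetical stand-ins chosen for illustration.

```javascript
// Hypothetical sketch only; none of these names come from the commit.
class TokenCounter {
  static instance = null;
  // Assumed threshold: above this many UTF-8 bytes we estimate instead of encoding.
  static MAX_BYTES = 1024;

  constructor(encode) {
    // Singleton: later constructions return the first instance unchanged.
    if (TokenCounter.instance) return TokenCounter.instance;
    this.encode = encode;
    TokenCounter.instance = this;
  }

  count(text = "") {
    const bytes = Buffer.byteLength(text, "utf8");
    if (bytes > TokenCounter.MAX_BYTES) {
      // Assumed heuristic: roughly 4 characters per token.
      return Math.ceil(text.length / 4);
    }
    return this.encode(text).length;
  }
}

// Stand-in encoder (whitespace split); a real tokenizer would be injected here.
const wordEncoder = (text) => text.split(/\s+/).filter(Boolean);

const a = new TokenCounter(wordEncoder);
const b = new TokenCounter(() => []); // ignored: the first instance is reused

console.log(a === b);
console.log(a.count("hello world"));
console.log(a.count("x".repeat(2000)));
```

Sharing one instance avoids rebuilding an encoder on every call, and the byte-size check caps the cost of counting very large inputs at the price of an approximate result.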
Changed files
- .github/workflows/dev-build.yaml (1 addition, 1 deletion)
- collector/processLink/convert/generic.js (1 addition, 1 deletion)
- collector/processRawText/index.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asAudio.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asDocx.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asEPub.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asMbox.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asOfficeMime.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asPDF/index.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asTxt.js (1 addition, 1 deletion)
- collector/processSingleFile/convert/asXlsx.js (1 addition, 1 deletion)
- collector/utils/extensions/Confluence/index.js (1 addition, 1 deletion)
- collector/utils/extensions/RepoLoader/GithubRepo/index.js (1 addition, 1 deletion)
- collector/utils/extensions/RepoLoader/GitlabRepo/index.js (1 addition, 1 deletion)
- collector/utils/extensions/WebsiteDepth/index.js (1 addition, 1 deletion)
- collector/utils/extensions/YoutubeTranscript/index.js (1 addition, 1 deletion)
- collector/utils/tokenizer/index.js (59 additions, 8 deletions)
- server/utils/AiProviders/deepseek/index.js (4 additions, 1 deletion)
- server/utils/helpers/tiktoken.js (46 additions, 4 deletions)