From 985f3b52bbb1a59ffd73f720e01dfa9f0e4367d1 Mon Sep 17 00:00:00 2001
From: Yi Ding <yi.s.ding@gmail.com>
Date: Fri, 21 Jul 2023 09:14:38 -0700
Subject: [PATCH] add README and CONTRIBUTING to dir and remove .turbo

---
 packages/core/.gitignore      |  1 +
 packages/core/CONTRIBUTING.md | 78 ++++++++++++++++++++++++++++++++++
 packages/core/README.md       | 80 +++++++++++++++++++++++++++++++++++
 3 files changed, 159 insertions(+)
 create mode 100644 packages/core/.gitignore
 create mode 100644 packages/core/CONTRIBUTING.md
 create mode 100644 packages/core/README.md

diff --git a/packages/core/.gitignore b/packages/core/.gitignore
new file mode 100644
index 000000000..c72a4fc77
--- /dev/null
+++ b/packages/core/.gitignore
@@ -0,0 +1 @@
+.turbo
diff --git a/packages/core/CONTRIBUTING.md b/packages/core/CONTRIBUTING.md
new file mode 100644
index 000000000..e441a4bb1
--- /dev/null
+++ b/packages/core/CONTRIBUTING.md
@@ -0,0 +1,78 @@
+# Contributing
+
+## Structure
+
+This is a monorepo built with Turborepo.
+
+Right now there are two packages of importance:
+
+packages/core is the main NPM library, llamaindex.
+
+apps/simple is where the demo code lives.
+
+### Turborepo docs
+
+You can check out how Turborepo works using the built-in [README-turborepo.md](README-turborepo.md).
+
+## Getting Started
+
+Install Node.js, preferably v18, using nvm or n.
+
+Inside the LlamaIndexTS directory:
+
+```
+npm i -g pnpm ts-node
+pnpm install
+```
+
+Note: we use pnpm in this repo. It has much of the same functionality and CLI options as npm, but it does some things better in a monorepo, like centralizing dependencies and caching.
+
+pnpm has documentation on its [workspace feature](https://pnpm.io/workspaces), and Turborepo has some [useful documentation as well](https://turbo.build/repo/docs/core-concepts/monorepos/running-tasks).
+
+### Running TypeScript
+
+When we publish to NPM we will have a tsc-compiled version of the library in JS. For now, the easiest thing to do is use ts-node.
+
+### Test cases
+
+To run them, run
+
+```
+pnpm run test
+```
+
+To write new test cases, add them in packages/core/src/tests.
+
+We use Jest (https://jestjs.io/) to write our test cases. Jest comes with a bunch of built-in assertions using the expect function: https://jestjs.io/docs/expect
+
+### Demo applications
+
+You can create new demo applications in the apps folder. Just run pnpm init in the folder after you create it to create its own package.json.
+
+### Installing packages
+
+To install packages for a specific package or demo application, run
+
+```
+pnpm add [NPM Package] --filter [package or application, e.g. core or simple]
+```
+
+To install packages for every package or application, run
+
+```
+pnpm add -w [NPM Package]
+```
+
+### Docs
+
+To contribute to the docs, go to the docs website folder and run the Docusaurus instance.
+
+```bash
+cd apps/docs
+pnpm install
+pnpm start
+```
+
+That should start a web server which will serve the docs on http://localhost:3000
+
+Any changes you make should be reflected in the browser. If you need to regenerate the API docs and find that your TSDoc isn't getting the updates, feel free to remove apps/docs/api. It will automatically regenerate itself when you run pnpm start again.
diff --git a/packages/core/README.md b/packages/core/README.md
new file mode 100644
index 000000000..1c2c0dd73
--- /dev/null
+++ b/packages/core/README.md
@@ -0,0 +1,80 @@
+# LlamaIndex.TS
+
+Use your own data with large language models (LLMs, OpenAI ChatGPT and others) in TypeScript and JavaScript.
+
+## What is LlamaIndex.TS?
+
+LlamaIndex.TS aims to be a lightweight, easy-to-use set of libraries to help you integrate large language models into your applications with your own data.
+
+## Getting started with an example:
+
+LlamaIndex.TS requires Node v18 or higher. You can download it from https://nodejs.org or use https://nvm.sh (our preferred option).
+
+In a new folder:
+
+```bash
+export OPENAI_API_KEY="sk-......"
# Replace with your key from https://platform.openai.com/account/api-keys
+npx tsc --init # if needed
+pnpm install llamaindex
+```
+
+Create the file example.ts:
+
+```ts
+// example.ts
+import fs from "fs/promises";
+import { Document, VectorStoreIndex } from "llamaindex";
+
+async function main() {
+  // Load essay from abramov.txt in Node
+  const essay = await fs.readFile(
+    "node_modules/llamaindex/examples/abramov.txt",
+    "utf-8"
+  );
+
+  // Create Document object with essay
+  const document = new Document({ text: essay });
+
+  // Split text and create embeddings. Store them in a VectorStoreIndex
+  const index = await VectorStoreIndex.fromDocuments([document]);
+
+  // Query the index
+  const queryEngine = index.asQueryEngine();
+  const response = await queryEngine.aquery(
+    "What did the author do in college?"
+  );
+
+  // Output response
+  console.log(response.toString());
+}
+```
+
+Then you can run it using:
+
+```bash
+npx ts-node example.ts
+```
+
+## Core concepts for getting started:
+
+- [Document](packages/core/src/Node.ts): A document represents a text file, PDF file or other contiguous piece of data.
+
+- [Node](packages/core/src/Node.ts): The basic data building block. Most commonly, these are parts of the document split into manageable pieces that are small enough to be fed into an embedding model and LLM.
+
+- [Embedding](packages/core/src/Embedding.ts): Embeddings are sets of floating point numbers which represent the data in a Node. By comparing the similarity of embeddings, we can derive an understanding of the similarity of two pieces of data. One use case is to compare the embedding of a question with the embeddings of our Nodes to see which Nodes may contain the data needed to answer that question.
+
+- [Indices](packages/core/src/indices/): Indices store the Nodes and the embeddings of those Nodes. QueryEngines retrieve Nodes from these Indices using embedding similarity.
+
+- [QueryEngine](packages/core/src/QueryEngine.ts): Query engines take the query you put in and give you back the result. Query engines generally combine a pre-built prompt with selected Nodes from your Index to give the LLM the context it needs to answer your query.
+
+- [ChatEngine](packages/core/src/ChatEngine.ts): A ChatEngine helps you build a chatbot that will interact with your Indices.
+
+- [SimplePrompt](packages/core/src/Prompt.ts): A simple standardized function call definition that takes in inputs and formats them in a template literal. SimplePrompts can be specialized using currying and combined using other SimplePrompt functions.
+
+## Contributing:
+
+We are in the very early days of LlamaIndex.TS. If you’re interested in hacking on it with us, check out our [contributing guide](CONTRIBUTING.md).
+
+## Bugs? Questions?
+
+Please join our Discord! https://discord.com/invite/eN6D2HQ4aX
--
GitLab