This project is mirrored from https://github.com/run-llama/llama_index.
  1. Feb 17, 2024
  2. Feb 16, 2024
  3. Feb 15, 2024
    • ClickHouse as a vector store (#10583) · 9cc433aa
      Dale McDiarmid authored
    • Fix the perf issue in building nodes from splits. (#10766) · f2d9472e
      preemoDez authored
      * Fix the perf issue in building nodes from splits.
      
      Create the `relationships` object only once. Otherwise, the hash of the whole text is recomputed for every node, which is very inefficient for long texts.
      
      An alternative approach would be to cache the hash property. However, that is not so straightforward because `Document` isn't a cacheable type. I also do not know Python very well; maybe it would be enough to store a null initially and skip the recomputation once a value is set? The most important reason, though, is that I'm not sure about the side effects: the existing assumption is that the node is mutable and the hash always reflects its state at the time of the call (unless the object is modified from multiple threads). This change doesn't break any assumptions. If the document were modified while we were creating nodes extracted from it, something would be very wrong.
      
      Benchmarks taken on a document attached to the bug:
      
      Before: Execution time for build_nodes_from_splits: 53.69 seconds
      
      After: Execution time for build_nodes_from_splits: 0.18 seconds
      
      * Fix the formatting
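
The idea described in this commit message (build the `relationships` object once instead of once per split) can be illustrated with a minimal sketch. This is not the actual llama_index code: `Doc`, `as_related_info`, `build_nodes_per_split`, and `build_nodes_hoisted` are hypothetical names, and the hash is assumed to be a SHA-256 over the full text that is recomputed on every access.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not the llama_index class."""

    text: str
    hash_calls: int = 0  # counts how often the hash is recomputed

    @property
    def hash(self) -> str:
        # Recomputed on every access, so each call costs O(len(text)).
        self.hash_calls += 1
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()

    def as_related_info(self) -> dict:
        # Building the related-node info reads the hash once.
        return {"source_hash": self.hash}


def build_nodes_per_split(doc: Doc, splits: list[str]) -> list[dict]:
    # "Before": the relationships dict is rebuilt for every split,
    # so the whole text is rehashed len(splits) times.
    return [
        {"text": s, "relationships": {"source": doc.as_related_info()}}
        for s in splits
    ]


def build_nodes_hoisted(doc: Doc, splits: list[str]) -> list[dict]:
    # "After": build the relationships once, then give each node a cheap
    # shallow copy so nodes do not share one mutable dict.
    relationships = {"source": doc.as_related_info()}
    return [{"text": s, "relationships": dict(relationships)} for s in splits]
```

With this sketch, the hoisted version reads the hash once regardless of the number of splits, while the per-split version reads it once per split. Both return equal node lists as long as the document is not modified during the call, which is the assumption the commit message relies on.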
    • abhiram1809's avatar
      c18b3234
  4. Feb 14, 2024
  5. Feb 13, 2024
  6. Feb 12, 2024