@ChrisHegarty
ChrisHegarty / gist:508bb1857cb50df0d757f711c81fd740
Created November 1, 2023 14:03
VectorUtilBenchmark.binaryDotProductScalar- #12743
davekim$ sudo /home/chegar/binaries/jdk-21.0.1/bin/java -XX:+UnlockDiagnosticVMOptions -jar lucene/benchmark-jmh/build/benchmarks/lucene-benchmark-jmh-10.0.0-SNAPSHOT.jar .*binaryDotProductScalar.* -psize=1024 -prof 'perfasm:intelSyntax=true'
# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-29
# VM invoker: /home/chegar/binaries/jdk-21.0.1/bin/java
# VM options: -XX:+UnlockDiagnosticVMOptions
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 3 iterations, 3 s each
# Measurement: 5 iterations, 3 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
ChrisHegarty / vector_dims_test_output.md
Created June 28, 2023 15:06
Output of vector dims test

Test run 1

davekim$ time /home/chegar/binaries/jdk-20.0.1/bin/java  -cp lucene-9.7.0/modules/*:/home/chegar/git/lucene/lucene/core/build/classes/java/test  -Xmx16g -Xms16g  org.apache.lucene.util.hnsw.KnnGraphTester  -dim 1024  -ndoc 2680961  -reindex  -docs vector_search-open_ai_vectors_1024-vectors_dims1024.bin  -maxConn 16  -beamWidthIndex 100
creating index in vector_search-open_ai_vectors_1024-vectors_dims1024.bin-16-100.index
Jun 28, 2023 1:44:34 PM org.apache.lucene.store.MemorySegmentIndexInputProvider <init>
INFO: Using MemorySegmentIndexInput with Java 20; to disable start with -Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false
MS 0 [2023-06-28T12:44:34.340877459Z; main]: initDynamicDefaults maxThreadCount=4 maxMergeCount=9
IFD 0 [2023-06-28T12:44:34.355786340Z; main]: init: current segments file is "segments"; deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@7e9a5fbe
IFD 0 [2023-06-28T12:44:34.358595927Z; main]: now delete 0 files: []
ChrisHegarty / index.png
Last active June 27, 2023 16:28
Accelerating Vector Search with SIMD instructions

Experimenting with JFR Mirror Events

This writeup describes a number of observations made during an experiment to determine the feasibility of using JFR mirror events in core parts of the JDK, in particular the Networking APIs. Mirror events are an internal mechanism used by the JDK that does not depend on the jdk.jfr module; currently, the only mirror events are defined in java.base. Since mirror events are only available within the JDK, much of what is discussed here is directly relevant to developers working on the JDK (rather than application or library developers), but certain concerns and techniques may be of interest to a wider audience.

When implementing events within the JDK, and specifically within the base module, there are currently a number of approaches: 1) a native implementation in the JVM, or 2) bytecode instrumentation at run time, or 3) a mirror event. The first, a native implementation in the JVM, is most appropriate for low-level operations of the JVM itself, e.g. GC sta

ChrisHegarty / loom_net.md
Last active July 8, 2024 16:50
Networking I/O with Virtual Threads - Under the hood

[Project Loom][loom] intends to deliver Java VM features and APIs to support easy-to-use, high-throughput lightweight concurrency and new programming models on the Java platform. This brings many interesting and exciting prospects, one of which is to simplify code that interacts with the network. Servers today can handle far larger numbers of open socket connections than the number of threads they can support, which creates both opportunities and challenges.

Unfortunately, writing scalable code that interacts with the network is hard. There is a threshold beyond which the use of synchronous APIs just doesn't scale, because such APIs can block when performing I/O operations, which in turn ties up a thread until the operation becomes ready, e.g. when trying to read data off a socket when there is no data currently available. Threads are (currently) an expensive resource in the Java platform, too costly to have tied up waiting around on I/O operati
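
To make the blocking problem described above concrete, here is a minimal sketch of the thread-per-task style that virtual threads are meant to keep affordable. It assumes JDK 21 or later, where `Executors.newVirtualThreadPerTaskExecutor()` is a standard API. The class `VirtualThreadEcho` and its methods `echo`, `roundTrip`, and `runClients` are hypothetical names invented for this illustration and do not come from the original write-up.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only; all names here are hypothetical.
class VirtualThreadEcho {

    // Server side: copy whatever the client sends straight back, using blocking I/O.
    static void echo(Socket socket) throws IOException {
        try (socket) {
            socket.getInputStream().transferTo(socket.getOutputStream());
        }
    }

    // Client side: blocking write, then a blocking read of the echoed bytes.
    static boolean roundTrip(int port, String message) throws IOException {
        byte[] sent = message.getBytes(StandardCharsets.UTF_8);
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            socket.getOutputStream().write(sent);
            socket.shutdownOutput(); // signal end of input so the server's copy loop finishes
            byte[] received = socket.getInputStream().readAllBytes();
            return Arrays.equals(sent, received);
        }
    }

    // Runs `clients` concurrent round trips, one virtual thread per blocking task,
    // and returns how many clients received their own message back.
    static int runClients(int clients) {
        try (ServerSocket server = new ServerSocket(0, clients, InetAddress.getLoopbackAddress());
             ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            int port = server.getLocalPort();

            // Acceptor task: hand each accepted connection to its own virtual thread.
            executor.submit(() -> {
                for (int i = 0; i < clients; i++) {
                    Socket accepted = server.accept();
                    executor.submit(() -> { echo(accepted); return null; });
                }
                return null;
            });

            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                String message = "hello-" + i;
                Future<Boolean> result = executor.submit(() -> { return roundTrip(port, message); });
                results.add(result);
            }

            int ok = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    ok++;
                }
            }
            return ok;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("successful round trips: " + runClients(100));
    }
}
```

Every call in this sketch is a plain synchronous socket operation. The only design choice is that each blocking task gets its own virtual thread rather than a pooled platform thread, so a task waiting on I/O should not tie up a scarce resource.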

ChrisHegarty / scopes.md
Last active April 21, 2021 15:08
Foreign Memory Access and NIO channels - Going Further

Foreign Memory Access and NIO channels - Going Further

The Java Platform's NIO channels currently only support I/O operations on synchronous channels with byte buffer views over confined memory segments. While somewhat of a limitation, this reflects a pragmatic solution to API constraints, while simultaneously pushing on the design of the [Foreign Memory Access API][foreign] itself.

With the latest evolution of the Foreign Memory Access API (targeting JDK 17), the lifecycle of memory segments is deferred to a higher-level abstraction, a [resource scope][pulling]. A resource scope manages the lifecycle of one or more memory segments, and has several different characteristics. We'll take a look at these characteristics in detail, but most notably there is now a way to render a shared memory segment as non-closeable for a period of time. Given this, we can revisit the current limitation on the kinds of memory segments that can be used with NIO channels, as well as the kinds of channels that can make use of

Monitoring Deserialization to Improve Application Security

Many Java frameworks rely on serialization and deserialization for exchanging messages between JVMs on different computers, or persisting data to disk. Monitoring deserialization can be helpful for application developers who use such frameworks, as it provides insight into the low level data which ultimately flows into the application. This insight, in turn, helps to configure [serialization filtering][filter], a mechanism introduced in Java 9 to prevent vulnerabilities by screening incoming data before it reaches the application. Framework developers may also benefit from a more efficient and dynamic means of monitoring the deserialization activity performed by their framework.
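
As a rough illustration of the serialization filtering mentioned above, the following sketch applies a pattern-based filter to a single stream using the standard `ObjectInputFilter.Config.createFilter` and `ObjectInputStream.setObjectInputFilter` APIs. The class `SerialFilterSketch`, its nested `Unexpected` type, the helper methods, and the particular pattern string are assumptions made for this example, not something taken from the original text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical.
class SerialFilterSketch {

    // Hypothetical application class that the filter pattern used in main does not allow.
    static class Unexpected implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Serializes a value to a byte array.
    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the data deserializes under the given filter pattern,
    // false if the stream rejects it with an InvalidClassException.
    static boolean accepted(byte[] data, String pattern) {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(pattern);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter);
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            // A REJECTED filter status surfaces to the caller as InvalidClassException.
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Allow classes in java.lang and java.util, reject everything else.
        String pattern = "java.lang.*;java.util.*;!*";
        byte[] list = serialize(new ArrayList<>(List.of("a", "b")));
        byte[] other = serialize(new Unexpected());
        System.out.println("list accepted: " + accepted(list, pattern));
        System.out.println("other accepted: " + accepted(other, pattern));
    }
}
```

Choosing a pattern like this one requires knowing which classes actually flow through the stream, which is the kind of insight that monitoring deserialization is meant to provide.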

Unfortunately, monitoring deserialization was hard because it required advanced knowledge of how the Java class libraries perform deserialization. For example, you would have to use brittle techniques like debugging or instrumenting calls to methods in `java.io.Obje

ChrisHegarty / SerializableRecords.md
Last active July 27, 2020 15:07
Serializable Records

Serializable Records

A record is a nominal tuple - a transparent, shallowly immutable carrier for a specific ordered sequence of elements. There are many interesting aspects of record classes, as can be read in Brian Goetz's spotlight article, but here we will focus on one of the lesser known aspects, record serialization, and how it differs from (and I would argue is better than) serialization of normal classes.

While the concept of serialization is quite simple, it often gets complicated very quickly given the various customizations that can be applied. For records we wanted to keep things as simple and straightforward as possible, so:

  1. Serialization of a record object is based only on its state components.
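
A minimal sketch of the point above, assuming JDK 16 or later (where records are a final feature): a record is written to a byte array and read back, and the copy compares equal because equality, like the serialized form, depends only on the state components. The names `RecordRoundTrip`, `Point`, and `roundTrip` are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch only; all names here are hypothetical.
class RecordRoundTrip {

    // Hypothetical record; x and y are its state components.
    record Point(int x, int y) implements Serializable {}

    // Serializes the value to a byte array and deserializes it again.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point original = new Point(1, 2);
        Point copy = roundTrip(original);
        System.out.println("copy equals original: " + copy.equals(original));
    }
}
```

Note that the record needs no `serialVersionUID`, `writeObject`, or `readObject` declarations for this to work; it only has to implement `Serializable`.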
ChrisHegarty / RecordUtil.java
Created June 29, 2020 14:59
Store j.l.rMethods in RecordUtil
package com.fasterxml.jackson.databind.util;
import com.fasterxml.jackson.databind.introspect.AnnotatedClass;
import com.fasterxml.jackson.databind.introspect.AnnotatedConstructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;
/**