Commit ac5f929

GH-5723: maintain documentation for http client update
programming/setup.md

Removed the outdated paragraph that instructed users to include the jcl-over-slf4j bridge because "RDF4J internally uses the Apache Commons HttpClient, which relies on JCL". Since AHC4 is gone, this requirement no longer exists. The surrounding general advice about SLF4J bridges and the "single logger config" conclusion were preserved.

programming/repository.md

Replaced the brief "Configuring the HTTP session thread pool" section with a fuller "Configuring the HTTP client" section covering:

- HTTP client backends: Apache HC5 (default) vs JDK, and the rdf4j.http.client.factory system property
- Connection pooling, timeouts, and SSL: RDF4JHttpClientConfig builder example, system properties table, and HttpClientBuilders.getSslTrustAllConfig() for SSL trust-all
- Thread pool: existing content preserved as a sub-section
- Authentication: BasicAuthenticationHandler and BearerTokenAuthenticationHandler examples

The authentication section uses session.setAuthenticationHandler(...)
1 parent dd4b2bb commit ac5f929

2 files changed

Lines changed: 67 additions & 8 deletions

site/content/documentation/programming/repository.md

Lines changed: 66 additions & 3 deletions
@@ -226,14 +226,59 @@ Repository repo = new SPARQLRepository(sparqlEndpoint);
226226

227227
After you have done this, you can query the SPARQL endpoint just as you would any other type of Repository.
228228

229-
#### Configuring the HTTP session thread pool
229+
#### Configuring the HTTP client
230230

231231
Both the HTTPRepository and the SPARQLRepository use the SPARQL Protocol over
232232
HTTP under the hood (in the case of the HTTPRepository, it uses the extended
233233
RDF4J REST API). The HTTP client session is managed by the {{< javadoc
234234
"HttpClientSessionManager"
235-
"http/client/HttpClientSessionManager.html" >}}, which in turn depends
236-
on the Apache HttpClient.
235+
"http/client/HttpClientSessionManager.html" >}}.
236+
237+
##### HTTP client backends
238+
239+
RDF4J ships with two HTTP client backends:
240+
241+
- **Apache HttpComponents 5** (`rdf4j-http-client-apache5`) — the preferred backend when multiple backends are available.
242+
- **JDK built-in HTTP client** (`rdf4j-http-client-jdk`) — a zero-dependency alternative using `java.net.http.HttpClient`.
243+
244+
Both are included as runtime dependencies of `rdf4j-http-client`. Backend selection works as follows: if the system property `rdf4j.http.client.factory` is set, that backend is used; otherwise RDF4J prefers `apache5` when it is available, and if only one backend factory is present on the classpath, that backend becomes the default.
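
The selection rule described above can be sketched as a small standalone function. This is only an illustration of the documented order of precedence, not RDF4J's actual implementation: the `BackendSelection` class and its `select` method are hypothetical, and the backend name `jdk` is an assumption (only `apache5` is named in the text). How RDF4J reacts when the property names a backend that is not on the classpath is not covered here; the sketch simply returns the requested name.

```java
import java.util.Optional;
import java.util.Set;

public class BackendSelection {

    // Hypothetical helper mirroring the documented rule:
    // 1. an explicit system property wins,
    // 2. otherwise "apache5" is preferred when available,
    // 3. otherwise a single available backend becomes the default.
    static Optional<String> select(String requested, Set<String> available) {
        if (requested != null && !requested.isBlank()) {
            return Optional.of(requested);
        }
        if (available.contains("apache5")) {
            return Optional.of("apache5");
        }
        if (available.size() == 1) {
            return Optional.of(available.iterator().next());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Property name taken from the text; the set of names is assumed.
        String requested = System.getProperty("rdf4j.http.client.factory");
        System.out.println(select(requested, Set.of("apache5", "jdk")));
    }
}
```

In practice the property would typically be passed on the command line, for example `-Drdf4j.http.client.factory=<name>`, so that it is visible before any HTTP client is created.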
245+
246+
##### Connection pooling, timeouts, and SSL
247+
248+
HTTP client settings are configured via {{< javadoc "RDF4JHttpClientConfig" "http/client/spi/RDF4JHttpClientConfig.html" >}}, which is built using a fluent builder and passed to the repository:
249+
250+
```java
251+
import org.eclipse.rdf4j.http.client.spi.RDF4JHttpClientConfig;
252+
import org.eclipse.rdf4j.http.client.spi.RDF4JHttpClients;
253+
254+
RDF4JHttpClientConfig config = RDF4JHttpClientConfig.newBuilder()
255+
.connectTimeoutMs(5_000)
256+
.socketTimeoutMs(30_000)
257+
.maxConnectionsPerRoute(10)
258+
.maxConnectionsTotal(25)
259+
.build();
260+
261+
repo.setHttpClient(RDF4JHttpClients.newDefaultClient(config));
262+
```
263+
264+
Connection pool size and timeouts can also be set globally via system properties on {{< javadoc "SharedHttpClientSessionManager" "http/client/SharedHttpClientSessionManager.html" >}}:
265+
266+
| System property | Default | Description |
267+
|---|---|---|
268+
| `org.eclipse.rdf4j.client.http.maxConnPerRoute` | 25 | Max connections per host |
269+
| `org.eclipse.rdf4j.client.http.maxConnTotal` | 50 | Max total connections |
270+
| `org.eclipse.rdf4j.client.http.connectionTimeout` | 30 000 ms | TCP connection timeout |
271+
| `org.eclipse.rdf4j.client.http.connectionRequestTimeout` | | Time to wait for a pooled connection |
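
As a rough sketch, the properties in the table can also be set programmatically. The property names come from the table above; the `PoolProperties` class is hypothetical, and two points are assumptions rather than documented behaviour: that the values are plain integers (milliseconds for the timeout), and that the properties must be set before the shared session manager is first created in order to take effect.

```java
public class PoolProperties {

    // Hypothetical helper: sets the documented system properties.
    // Assumes plain integer values and that this runs before any
    // HTTP-based repository is initialised.
    static void apply(int maxPerRoute, int maxTotal, int connectTimeoutMs) {
        System.setProperty("org.eclipse.rdf4j.client.http.maxConnPerRoute",
                Integer.toString(maxPerRoute));
        System.setProperty("org.eclipse.rdf4j.client.http.maxConnTotal",
                Integer.toString(maxTotal));
        System.setProperty("org.eclipse.rdf4j.client.http.connectionTimeout",
                Integer.toString(connectTimeoutMs));
    }

    public static void main(String[] args) {
        apply(10, 25, 5_000);
        System.out.println(
                System.getProperty("org.eclipse.rdf4j.client.http.maxConnTotal"));
    }
}
```

The same values can be supplied as `-D` arguments on the JVM command line, which avoids any ordering concerns inside the application.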
272+
273+
For SSL trust-all (e.g. self-signed certificates in test environments), use the {{< javadoc "HttpClientBuilders" "http/client/util/HttpClientBuilders.html" >}} utility. **Warning:** this disables SSL certificate validation and hostname verification, and must not be used in production.
274+
275+
```java
276+
import org.eclipse.rdf4j.http.client.util.HttpClientBuilders;
277+
278+
repo.setHttpClient(RDF4JHttpClients.newDefaultClient(HttpClientBuilders.getSslTrustAllConfig()));
279+
```
280+
281+
##### Thread pool
237282

238283
The session uses a caching thread pool executor to handle multithreaded
239284
access to a remote endpoint, defined by default to use a thread pool with a
@@ -243,6 +288,24 @@ To configure this to use a different core pool size, you can specify the
243288
`org.eclipse.rdf4j.client.executors.corePoolSize` system property with a
244289
different number.
245290

291+
##### Authentication
292+
293+
For HTTP-based repositories, basic authentication can be configured directly on the repository. If you need bearer-token authentication, configure a custom session manager that installs an {{< javadoc "AuthenticationHandler" "http/client/spi/AuthenticationHandler.html" >}} on each created {{< javadoc "SPARQLProtocolSession" "http/client/SPARQLProtocolSession.html" >}}:
294+
295+
```java
296+
import org.eclipse.rdf4j.http.client.spi.BasicAuthenticationHandler;
297+
import org.eclipse.rdf4j.http.client.spi.BearerTokenAuthenticationHandler;
298+
299+
// Basic authentication
300+
session.setAuthenticationHandler(new BasicAuthenticationHandler("user", "secret"));
301+
302+
// Bearer token (static)
303+
session.setAuthenticationHandler(new BearerTokenAuthenticationHandler("my-token"));
304+
305+
// Bearer token (dynamic, e.g. refreshing OAuth access token)
306+
session.setAuthenticationHandler(new BearerTokenAuthenticationHandler(tokenStore::currentToken));
307+
```
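
The `tokenStore::currentToken` reference in the dynamic example above is left undefined. One possible shape for such a store is sketched below. Everything here is hypothetical: the `RefreshingTokenStore` class, its fixed-lifetime caching policy, and the `fetcher` callback standing in for a real OAuth token request. It also assumes the handler accepts a `java.util.function.Supplier<String>`, which the method reference suggests but the text does not state.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

public class RefreshingTokenStore {

    private final Supplier<String> fetcher; // hypothetical: obtains a fresh token
    private final Duration lifetime;        // how long a fetched token is reused
    private String token;
    private Instant expiresAt = Instant.MIN;

    public RefreshingTokenStore(Supplier<String> fetcher, Duration lifetime) {
        this.fetcher = fetcher;
        this.lifetime = lifetime;
    }

    // Returns the cached token, fetching a new one once the cached
    // token is missing or its lifetime has elapsed.
    public synchronized String currentToken() {
        Instant now = Instant.now();
        if (token == null || !now.isBefore(expiresAt)) {
            token = fetcher.get();
            expiresAt = now.plus(lifetime);
        }
        return token;
    }
}
```

With a store like this, the handler from the example would be created as `new BearerTokenAuthenticationHandler(store::currentToken)`, so each request picks up whatever token is current at that moment.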
308+
246309
### The RepositoryManager and RepositoryProvider
247310

248311
Using what we’ve seen in the previous section, we can create and use various different types of repositories. However, when developing an application in which you have to keep track of several repositories, sharing references to these repositories between different parts of your code can become complex. The {{< javadoc "RepositoryManager" "repository/manager/RepositoryManager.html" >}} and {{< javadoc "RepositoryProvider" "repository/manager/RepositoryProvider.html" >}} provide one central location where all information on the repositories in use (including id, type, directory for persistent data storage, etc.) is kept.

site/content/documentation/programming/setup.md

Lines changed: 1 addition & 5 deletions
@@ -134,10 +134,6 @@ What you need to do is to decide which logging implementation you are going to u
134134

135135
One thing to keep in mind when configuring logging is that SLF4J expects only a single logger implementation on the classpath. Thus, you should choose only a single logger. In addition, if parts of your code depend on projects that use other logging frameworks directly, you can include a Legacy Bridge which makes sure calls to the legacy logger get redirected to SLF4J (and from there on, to your logger of choice).
136136

137-
In particular, when working with RDF4J’s HTTPRepository or SPARQLRepository libraries, you should include the `jcl-over-slf4j` legacy bridge. This is because RDF4J internally uses the Apache Commons HttpClient, which relies on JCL (Jakarta Commons Logging). You can do without this if your own app is a webapp, to be deployed in e.g. Tomcat, but otherwise, your application will probably show a lot of debug log messages on standard output, starting with something like:
138-
139-
DEBUG httpclient.wire.header
140-
141-
When you set this up correctly, you can have a single logger configuration for your entire project, and you will be able to control both this kind of logging by third party libraries and by RDF4J itself using this single config.
137+
When you set this up correctly, you can have a single logger configuration for your entire project, and you will be able to control logging by third party libraries and by RDF4J itself using this single config.
142138

143139
The RDF4J framework itself does not prescribe a particular logger implementation (after all, that’s the whole point of SLF4J, that you get to choose your preferred logger). However, several of the applications included in RDF4J (such as RDF4J Server, Workbench, and the command line console) do use a logger implementation. The server and console application both use logback, which is the successor to log4j and a native implementation of SLF4J. The Workbench uses `java.util.logging` instead.
