FAST Impulse – JDBC Connector Out of Memory Exception

We run a FAST ESP + Impulse implementation as our search platform. The main data load activities use the EXLT file format and the EXLT and JDBC connectors to load documents into the FAST index.

When we added a new content source to our loading process and created a new JDBC connector instance to load its data, we started getting an out of memory exception from the JDBC connector.

The error message we found in the connector's log was:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at com.microsoft.sqlserver.jdbc.TDSPacket.<init>(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSReader.readResponse(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSCommand.execute(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(Unknown Source)
    at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeQuery(Unknown Source)
    at com.fastsearch.components.jdbcconnector.JdbcDocumentIterator.<init>(JdbcDocumentIterator.java:256)
    at com.fastsearch.components.jdbcconnector.JdbcAccess.iterator(JdbcAccess.java:224)
    at com.fastsearch.components.jdbcconnector.JdbcConnector.processDocuments(JdbcConnector.java:835)
    at com.fastsearch.components.jdbcconnector.JdbcConnector.runConnector(JdbcConnector.java:803)
    at com.fastsearch.components.jdbcconnector.JdbcConnector.main(JdbcConnector.java:1897)
Exception in thread "SubmitterThread" java.lang.OutOfMemoryError: Java heap space
    at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(ChunkedInputStream.java:367)
    at org.apache.commons.httpclient.ContentLengthInputStream.close(ContentLengthInputStream.java:117)
    at java.io.FilterInputStream.close(FilterInputStream.java:159)
    at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(AutoCloseInputStream.java:176)
    at org.apache.commons.httpclient.AutoCloseInputStream.close(AutoCloseInputStream.java:140)
    at org.apache.commons.httpclient.HttpMethodBase.releaseConnection(HttpMethodBase.java:1078)
    at com.fastsearch.esp.content.http.SessionFactory.releaseConnection(SessionFactory.java:259)
    at com.fastsearch.esp.content.http.Session.processCall(Session.java:477)
    at com.fastsearch.esp.content.http.Session.process(Session.java:372)
    at com.fastsearch.esp.content.http.Dispatcher.send(Dispatcher.java:1104)
    at com.fastsearch.esp.content.http.ContentManager.submitContentOperations(ContentManager.java:247)
    at com.fastsearch.esp.content.http.ContentManager.submitContentOperations(ContentManager.java:206)
    at com.fastsearch.esp.content.feeding.DocumentSubmitter.doSubmitBatch(DocumentSubmitter.java:279)
    at com.fastsearch.esp.content.feeding.DocumentSubmitter.submitBatch(DocumentSubmitter.java:258)
    at com.fastsearch.esp.content.feeding.DocumentSubmitter.run(DocumentSubmitter.java:170)
    at java.lang.Thread.run(Thread.java:595)
Full thread dump Java HotSpot(TM) Client VM (1.5.0_22-b03 mixed mode, sharing):

"DestroyJavaVM" prio=6 tid=0x00108510 nid=0xfb4 waiting on condition [0x00000000..0x000bfab0]

"BatchTimerThread" prio=6 tid=0x03876588 nid=0x1b00 waiting on condition [0x03e1f000..0x03e1fc30]
    at java.lang.Thread.sleep(Native Method)
    at com.fastsearch.esp.content.http.BatchTimer.run(BatchTimer.java:32)
    at java.lang.Thread.run(Thread.java:595)

"CallbackPollThread" prio=6 tid=0x035e33a8 nid=0xd24 in Object.wait() [0x03bbf000..0x03bbfcb0]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x24924490> (a java.lang.Object)
    at java.lang.Object.wait(Object.java:474)
    at com.fastsearch.esp.content.http.Session.waitForActiveBatches(Session.java:292)
    - locked <0x24924490> (a java.lang.Object)
    at com.fastsearch.esp.content.http.Session.run(Session.java:244)
    at java.lang.Thread.run(Thread.java:595)

"CallbackHandlerThread" prio=6 tid=0x033fa800 nid=0xec0 in Object.wait() [0x03d9f000..0x03d9fd30]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x249244f8> (a java.lang.Object)
    at java.lang.Object.wait(Object.java:474)
    at no.fast.util.SynchronizedQueue.dequeue(SynchronizedQueue.java:148)
    - locked <0x249244f8> (a java.lang.Object)
    at no.fast.util.SynchronizedQueue.dequeue(SynchronizedQueue.java:97)
    at com.fastsearch.esp.content.http.CallbackHandler.run(CallbackHandler.java:145)
    at java.lang.Thread.run(Thread.java:595)

"MultiThreadedHttpConnectionManager cleanup" daemon prio=6 tid=0x03610bb0 nid=0x1970 in Object.wait() [0x03c3f000..0x03c3fab0]
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:120)
    - locked <0x2488d058> (a java.lang.ref.ReferenceQueue$Lock)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ReferenceQueueThread.run(MultiThreadedHttpConnectionManager.java:1082)

"Thread-0" daemon prio=6 tid=0x034daa98 nid=0xd34 waiting on condition [0x03a2f000..0x03a2fbb0]
    at java.lang.Thread.sleep(Native Method)
    at org.apache.log4j.helpers.FileWatchdog.run(FileWatchdog.java:95)

"Low Memory Detector" daemon prio=6 tid=0x00f7ba68 nid=0x16bc runnable [0x00000000..0x00000000]

"CompilerThread0" daemon prio=10 tid=0x00faaf38 nid=0x161c waiting on condition [0x00000000..0x0326fa10]

"Signal Dispatcher" daemon prio=10 tid=0x00f7b0e0 nid=0xdc waiting on condition [0x00000000..0x00000000]

"Finalizer" daemon prio=8 tid=0x00f78de8 nid=0x1688 in Object.wait() [0x0316f000..0x0316fa30]
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:120)
    - locked <0x24833758> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:136)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00f78240 nid=0x1978 in Object.wait() [0x030ef000..0x030efab0]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x248337e0> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:474)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0x248337e0> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x00fa2d48 nid=0x1ba0 runnable

"VM Periodic Task Thread" prio=10 tid=0x00f6af18 nid=0xefc waiting on condition

Reading the error message, we concluded the Java heap was running out of memory while the connector was processing the documents we were trying to load.

Checking the Java virtual machine configuration, we found that the maximum heap size for Java processes was set to 64 MB.

We raised it to 300 MB, specifically for the JDBC connector.

To do this, we changed the Java options in the connector.windows.conf file under the FAST_ESP_Root\components\jdbcconnector\bin folder.

The options parameter was set to: JAVA_OPTS=-Xmx300m
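As a sketch (the exact contents of the file vary by installation and ESP version, so only the JAVA_OPTS line below is the actual change we made):

# connector.windows.conf (under FAST_ESP_Root\components\jdbcconnector\bin)
# Raise the JDBC connector's maximum Java heap from 64 MB to 300 MB
JAVA_OPTS=-Xmx300m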

After that, we restarted the connector using the command: nctrl restart jdbc_connector_name

The documents immediately started being processed, with no error messages in the JDBC connector log.

See you,

Amadeu.


FAST Impulse – SQL Queries to Monitor the JDBC Connector Execution

If you run FAST ESP + Impulse and use the Impulse and JDBC connectors to load data into FAST, you probably need to monitor how the connector is running and processing documents for each of your collections.

I use the following SQL queries to get the current status of the items in the Impulse Items database (ImpulseItems schema).

This query shows the document counts per collection, broken down by status:

SELECT collection_name, update_flag, COUNT(*) AS document_count
FROM ImpulseItems.status WITH (NOLOCK)
GROUP BY collection_name, update_flag
ORDER BY collection_name

This query shows the statuses of the documents for a specific collection:

SELECT update_flag, COUNT(*) AS document_count
FROM ImpulseItems.status WITH (NOLOCK)
WHERE collection_name = '[collection name]'
GROUP BY update_flag

The status is defined by the update_flag field, whose values mean:

  • < -1: the document is being deleted by one of the JDBC connector instances. The value identifies the JDBC process defined in the NodeConf.xml file.
  • -1: the document should be deleted the next time the JDBC connector instance runs.
  • 0: no updates to the document; it doesn't need to be processed.
  • 1: the document has been updated and needs to be processed.
  • between 2 and 221: the document is being processed by one of the JDBC connector instances. The value identifies the JDBC process defined in the NodeConf.xml file.
  • 222: the document is being loaded by the EXLT connector.
  • 333: the document is locked by the EML server.
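As an example of putting these values to use, the following query buckets a collection's documents by what their update_flag means. It is a sketch against the same ImpulseItems.status table; the status descriptions in the CASE expression are my own shorthand, not official FAST status names:

SELECT status_description, COUNT(*) AS document_count
FROM (
    SELECT CASE
             WHEN update_flag < -1              THEN 'being deleted by a JDBC instance'
             WHEN update_flag = -1              THEN 'pending deletion'
             WHEN update_flag = 0               THEN 'up to date'
             WHEN update_flag = 1               THEN 'pending processing'
             WHEN update_flag BETWEEN 2 AND 221 THEN 'being processed by a JDBC instance'
             WHEN update_flag = 222             THEN 'being loaded by the EXLT connector'
             WHEN update_flag = 333             THEN 'locked by the EML server'
             ELSE 'unknown'
           END AS status_description
    FROM ImpulseItems.status WITH (NOLOCK)
    WHERE collection_name = '[collection name]'
) AS s
GROUP BY status_description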

It is important to run these queries with the NOLOCK table hint to avoid interfering with the connector's execution (no blocking of SQL Server processes).

See you,

Amadeu.