ESQL: Limit Replace function memory usage #127924
base: main
Conversation
Hi @ivancea, I've created a changelog YAML for you.
@@ -121,15 +125,15 @@ public boolean foldable() {
         return str.foldable() && regex.foldable() && newStr.foldable();
     }

-    @Evaluator(extraName = "Constant", warnExceptions = PatternSyntaxException.class)
+    @Evaluator(extraName = "Constant", warnExceptions = IllegalArgumentException.class)
`PatternSyntaxException` is an `IllegalArgumentException`. If we choose another exception for this PR, we would have to restore `PatternSyntaxException`.
@@ -39,6 +41,8 @@
 public class Replace extends EsqlScalarFunction {
     public static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(Expression.class, "Replace", Replace::new);

+    static final long MAX_RESULT_LENGTH = MB.toBytes(1);
This is based on the same Repeat logic (line 43 in c990377):

    static final long MAX_REPEATED_LENGTH = MB.toBytes(1);
We might want to shove this in some common spot. I have no idea what a good one is. And probably javadoc it too.
nit: We could make both use the same constant, then future readers can easily see that it's being used in 2 cases which tackle a common problem.
Moved it to `ScalarFunction`. I don't expect aggs to need this, but we could move it later.
do {
    m.appendReplacement(result, newStr);

    if (result.length() > MAX_RESULT_LENGTH) {
We're checking after the replacement. This has a downside: if the replacement uses a group (`$1`), it could, theoretically, still generate a big string. E.g. `Replace("<100 chars>", ".+", "$0<...x 100>")` could yield 10,000 chars.

We could take this into account, and use something like:

    int matchSize = m.end() - m.start();
    int potentialReplacementSize = matchSize * matchSize / 2; // "$0" == 1 repetition
    int remainingStr = str.length() - m.end();
    if (result.length() + potentialReplacementSize + remainingStr > MAX_RESULT_LENGTH) {
        throw ...;
    }

Opinions? How predictive should we be?
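For context, the check-after-append flow under discussion can be sketched in plain Java. The class and method names and the raw `1024 * 1024` constant are illustrative stand-ins, not the PR's exact code (the PR uses `MB.toBytes(1)` and lives inside a generated evaluator):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceLimitSketch {
    // Illustrative 1 MB limit, standing in for MB.toBytes(1).
    static final long MAX_RESULT_LENGTH = 1024 * 1024;

    // Sketch of the append loop: the length check runs only AFTER each
    // replacement is appended, which is the downside discussed above.
    static String replaceWithLimit(String str, String regex, String newStr) {
        Matcher m = Pattern.compile(regex).matcher(str);
        if (m.find() == false) {
            return str;
        }
        StringBuilder result = new StringBuilder();
        do {
            m.appendReplacement(result, newStr);
            if (result.length() > MAX_RESULT_LENGTH) {
                throw new IllegalArgumentException(
                    "Creating strings with more than [" + MAX_RESULT_LENGTH + "] bytes is not supported"
                );
            }
        } while (m.find());
        m.appendTail(result);
        return result.toString();
    }
}
```

A single `appendReplacement` call can overshoot the limit before the check runs, which is exactly the group-expansion concern raised here.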
Hmm, the estimate `matchSize * newStr.length()` could be too conservative. If I have a 2kb input string, the regex matches the whole thing, and I want to replace it with another 1kb string (no group refs!), this would trigger as a false positive. Not sure how restrictive this is in practice.

It looks like `appendReplacement` only allocates memory through the string builder even in the case of group references, although I could've missed something; I only skimmed the implementation. But if we want to be precise, we could maybe inherit from StringBuilder and make it perform our stricter checks every time we append. I don't know if the effort is worth it.

Maybe a simpler approach is to count how many group references `newStr` actually has, but that could tank performance :/

@nik9000, what do you think?
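A rough sketch of that counting idea (a hypothetical helper, not code from the PR): scan `newStr` once for unescaped `$<digit>` references. Real `Matcher` replacement syntax also allows multi-digit indices and named groups like `${name}`, which this simplified version ignores:

```java
public class GroupRefCounter {
    // Count unescaped "$<digit>" group references in a replacement string.
    // Simplified: ignores multi-digit indices and named groups like ${name}.
    static int countGroupReferences(String newStr) {
        int count = 0;
        for (int i = 0; i < newStr.length() - 1; i++) {
            char c = newStr.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character, e.g. "\$" is a literal dollar
            } else if (c == '$' && Character.isDigit(newStr.charAt(i + 1))) {
                count++;
            }
        }
        return count;
    }
}
```

The scan is linear in the replacement string, so the performance worry is about doing it per row rather than once per query.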
I'm not sure either. I was chatting with @ivancea over Slack. I kind of think we should take our gloves off and build this by appending directly to a `BreakingBytesRefBuilder`. That way we're reusing the bytes and don't have to copy to encode to UTF-8. We have the JVM as a testing oracle, so we can clean-room implement it. That's probably a few days of work, though. All because we don't have proper monomorphization.
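The "JVM as a testing oracle" idea could look roughly like this: run the clean-room implementation and the JDK's `replaceAll` on random inputs and compare. `customReplace` below is a placeholder that delegates to the JDK just so the harness is runnable; in the real work it would be the `BreakingBytesRefBuilder`-based implementation:

```java
import java.util.Random;
import java.util.regex.Pattern;

public class OracleCheck {
    // Placeholder for the clean-room implementation under test.
    static String customReplace(String str, String regex, String newStr) {
        return Pattern.compile(regex).matcher(str).replaceAll(newStr);
    }

    // Compare against the JDK oracle on short random strings over {a, b, c},
    // using a pattern and a group-referencing replacement.
    static boolean agreesWithJdk(long seed, int rounds) {
        Random random = new Random(seed);
        for (int i = 0; i < rounds; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < 20; j++) {
                sb.append((char) ('a' + random.nextInt(3)));
            }
            String str = sb.toString();
            String expected = str.replaceAll("ab", "X$0Y");
            if (expected.equals(customReplace(str, "ab", "X$0Y")) == false) {
                return false;
            }
        }
        return true;
    }
}
```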
Just pushed this algorithm, taking into account the number of groups, so it's much more specific. Still quite conservative, but not by much, I think? And if you don't use groups, it's nearly pixel-perfect.
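A hedged sketch of what such a group-aware estimate might look like (the merged formula may differ; all names here are illustrative): bound the worst-case appended size by the replacement's literal length plus `matchSize` per group reference, plus the unprocessed tail of the input:

```java
public class PredictiveCheck {
    // Worst-case size estimate for one appendReplacement call: every group
    // reference can expand to at most the whole match, and everything after
    // the match may still be appended verbatim by appendTail.
    static boolean wouldExceedLimit(
        int resultLength,
        int matchSize,
        int literalLength,
        int groupRefs,
        int remainingStr,
        long maxResultLength
    ) {
        long potentialReplacementSize = (long) literalLength + (long) groupRefs * matchSize;
        return resultLength + potentialReplacementSize + remainingStr > maxResultLength;
    }
}
```

With zero group references the estimate collapses to the replacement's literal length, which is why the no-groups case can be near exact.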
Fix looks correct to me, thanks @ivancea!
    if (result.length() > MAX_RESULT_LENGTH) {
        throw new IllegalArgumentException("Creating strings with more than [" + MAX_RESULT_LENGTH + "] bytes is not supported");
    }
} while (m.find());
Comparing this with the implementation of `replaceAll`, this seems to be equivalent, so I think we don't accidentally change the semantics. Nice.
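That equivalence can be checked directly: a `find()`/`appendReplacement`/`appendTail` loop with this shape (shown here without the limit, for brevity; names are illustrative) produces the same output as `replaceAll`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceAllEquivalence {
    // Same loop shape as the code above, minus the length check.
    static String loopReplace(String str, String regex, String newStr) {
        Matcher m = Pattern.compile(regex).matcher(str);
        if (m.find() == false) {
            return str;
        }
        StringBuilder result = new StringBuilder();
        do {
            m.appendReplacement(result, newStr);
        } while (m.find());
        m.appendTail(result);
        return result.toString();
    }
}
```

The leading `find()` also lets the no-match case return the input without allocating a builder at all.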
List<Object> simpleData = testCase.getDataValues();
// Ensure we don't run this test with so much data that it takes too long to process.
// The calculation "ramUsed * count" is just a hint of how much data the function will process,
// and the limit is arbitrary.
assumeTrue("Input data too big", row(simpleData).ramBytesUsedByBlocks() * count < GB.toBytes(1));
Why's this needed? Did we make REPLACE super slow?
As I added the test with a lot of data to the ReplaceTests file, this specific case was very slow (3s -> 40s). Anyway, I just moved the big cases to another file, so they're executed just once, same as `RepeatStaticTests` does. So this isn't needed anymore.
…oing the actual replacement
Pinging @elastic/es-analytical-engine (Team:Analytics)
The Replace string result limit was fixed at 1MB, the same as Repeat.