ONTAP Discussions

Failed to create a file with surrogate pair characters UTF-16

JP1
12,079 Views

We are running into a problem when copying files to a CIFS share running on CDOT 9.4.  Every file that contains 'special' characters in its name is rejected.  We need to store forensic dumps of cell phone data with emojis and other off-the-wall characters.  As these are evidence, we cannot alter the files to comply with the base UTF character set that NetApp uses.

 

The email alert (see below) that gets generated has some steps to enable UTF surrogate pairs, but the syntax isn't valid (as far as I can tell) for CDOT 9.4.  So, I have two/three questions:

1.  What are the ramifications of enabling UTF surrogate pairs (should we do it)?

2.  What is the proper syntax to implement the change?

 2.a.  How do we make the change persistent across reboots?

 

 

Subject: NODE_2: wafl.dir.surrpair.filename [LOG_ERR]

Filer: NODE_2

Time: Fri, Sep 20 09:50:41 2019 -0400

Severity: LOG_ERR

 

Message: wafl.dir.surrpair.filename: Failed to create a file with surrogate pair characters in the name in the directory /vol/<redacted>.

 

Description: This message occurs as a warning when a file name with surrogate pair characters in UTF-16 encoding fails to be created in a parent directory.

 

Action: To allow names with surrogate pairs to be created, use the following command: "setflag wafl_reject_surrogate_pair 0". If the option needs to be set across reboots, set the bootarg 'wafl-accept-surrogate-pair?' to "true" at the LOADER prompt.

 

Source: wafl_exempt00

Index: 14321181
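
To gauge how many files are affected before changing any setting, the names that would trigger this alert can be listed on the client side. The following is only a minimal sketch, assuming the rejected names are exactly those containing a character above U+FFFF (which is what needs a surrogate pair in UTF-16). The helper names `has_non_bmp` and `find_surrogate_pair_names` are hypothetical, not part of any NetApp tool.

```python
import os
from typing import List


def has_non_bmp(name: str) -> bool:
    """Return True if any character in the name lies above U+FFFF."""
    return any(ord(ch) > 0xFFFF for ch in name)


def find_surrogate_pair_names(root: str) -> List[str]:
    """Walk a directory tree and collect paths whose own file or
    directory name contains a character above U+FFFF."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if has_non_bmp(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Running `find_surrogate_pair_names` against the source directory of the forensic dump returns the paths to check; it only reads names and does not modify anything.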

 

1 ACCEPTED SOLUTION

Ontapforrum
12,048 Views

Hi,

In addition to the last response, here is some more info.

There is a KB article on this issue (I have included its content below):

https://kb.netapp.com/app/answers/answer_view/a_id/1008772/loc/en_US#__highlight


Cause:
Clustered Data ONTAP prior to 9.5 supports only Unicode characters from the Basic Multilingual Plane (UCS-2), so it cannot handle characters that need more than 16 bits to represent, such as emojis and other characters that UTF-16 encodes as surrogate pairs.
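
To illustrate what a surrogate pair is, here is a small Python sketch (not from the KB article) that splits a character into its UTF-16 code units. The function name `utf16_code_units` is an assumption made for this example. A character inside the Basic Multilingual Plane yields one 16-bit unit, while an emoji such as U+1F600 yields two.

```python
from typing import List


def utf16_code_units(text: str) -> List[int]:
    """Return the 16-bit UTF-16 code units that encode the given text."""
    data = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# 'A' (U+0041) fits in a single 16-bit unit.
print([hex(u) for u in utf16_code_units("A")])           # ['0x41']

# U+1F600 (grinning face) needs a high and a low surrogate.
print([hex(u) for u in utf16_code_units("\U0001F600")])  # ['0xd83d', '0xde00']
```

Any file name containing a character of the second kind is one that a UCS-2-only system cannot represent as a single 16-bit character.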


Solution:
ONTAP 9.5 added a new volume language, utf8mb4.

 

 

As you are on 9.4, the following workaround is suggested:

 

Workaround:
::> node run -node <Node name>
Filer> priv set diag
Filer*> setflag wafl_reject_surrogate_pair 0
Filer*> printflag wafl_reject_surrogate_pair
wafl_reject_surrogate_pair =0

 

To make this change persistent across reboots, add the following command to the /etc/rc file:
priv set diag;setflag wafl_reject_surrogate_pair 0;priv set admin


6 REPLIES

aborzenkov
12,053 Views

setflag is a nodeshell diag-level command.

 

node run local

priv set diag 

setflag ...

JP1
12,031 Views

 

Thanks for that tip.  I looked for setflag under set -priv diag, but not under local node.  Indeed setflag is recognized there.

If I understand this, then I need to set the flag on each node that might host this volume?

JP1
12,016 Views

 

Thanks for that link.  That's just the article I was looking for.  If I'm understanding this correctly, we would need to upgrade to 9.5+ and also create a new volume for the CIFS share, correct?  In that scenario, it seems we would no longer need to use setflag wafl_reject_surrogate_pair 0.  Is that also a correct statement?

Ontapforrum
11,995 Views

Yes, that's absolutely correct. 

 

With 7-Mode, we could change the language of an existing volume, although that required a reboot.  With cDOT, however, the language of an existing volume cannot be changed.  In cDOT, a volume inherits its language from the SVM language. The SVM language can be changed, but only new volumes inherit the new setting; existing volumes keep the language they were created with.

 

Therefore, I agree with you - Upgrade to 9.5 and then create a new volume with language 'utf8mb4'.

MStubbs
10,967 Views

With ONTAP 9.x, how does one make this persistent across reboots?  /etc/rc no longer seems to exist.
