Monday, April 11, 2011

How to check if a Unicode character is within a given range in C?

The following function was written for Java and has been adapted for C.

bool isFullwidthKatakana(WideChar C)
{
  return(('\u30a0'<=C)&&(C<='\u30ff'));
}

The problem is that my development environment ("CodeGear C++Builder") shows this warning:

[BCC32 Warning] Unit1.cpp(101): W8114 Character represented by universal-character-name '\u30a0' cannot be represented in the current code page (1252)

and the function does not return true even when the conditions are met.

For example one input is 'ア' (0x30A2).

What should I do? How can I change the code page?

Thank you for the three answers; they all resolved it.

It seems that the narrow character literal '\u30a0' wasn't correct. All of these were correct:

return((0x30a0<=C)&&(C<=0x30ff));
return (unsigned int) C >= 0x30a0u && (unsigned int) C <= 0x30ffu;
return((L'\u30a0'<=C)&&(C<=L'\u30ff'));

From Stack Overflow
  • It should be possible to cast (explicitly or implicitly) the character to an unsigned integer, and then compare it against plain integer constants:

    return (unsigned int) C >= 0x30a0u && (unsigned int) C <= 0x30ffu;
    

    should do it.

    By the way, I'd recommend against using a single-character uppercase argument name; it's very easy to mistake it for a compile-time constant (which are typically uppercase in C and C++).

  • IIUC, you need to check if a wide Unicode character (probably UTF-16, since you're on Windows) is within a range. This can be done with the code you've shown; you just have to make the character literals wide character literals. In C and C++, these are made by prepending L to the literal, e.g. L'a' or L"ahoj".

    In your case, I'd try

    bool isFullwidthKatakana(WideChar C)
    {
      return((L'\u30a0'<=C)&&(C<=L'\u30ff'));
    }
    
  • The warning appears to be related to the use of the character literal, not the test. So test against the code point as an integral literal, e.g.:

    bool isFullwidthKatakana(WideChar C)
    {
      return(( (WideChar)0x30a0 <= C )&&(C <= (WideChar)0x30ff ));
    }
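
The three answers above come down to the same idea: compare the character against the code points themselves rather than against narrow character literals. A minimal sketch of that idea follows. It assumes that WideChar is C++Builder's alias for wchar_t (the typedef is written out here only so the snippet compiles elsewhere), and the parameter name ch is an illustrative choice that follows the naming advice in the first answer.

```cpp
// Assumption: WideChar is an alias for wchar_t, as in C++Builder.
// The typedef is spelled out only to keep this sketch self-contained.
typedef wchar_t WideChar;

// Compare against numeric code points, so the result does not depend on
// the code page the compiler uses for narrow character literals.
bool isFullwidthKatakana(WideChar ch)
{
    return ch >= 0x30A0 && ch <= 0x30FF;
}
```

With this definition, the code point 0x30A2 from the question falls inside the range, while L'a' and 0x3100 fall outside it.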
    
