Looking for a C++ program that filters the Chinese text out of a document containing both Chinese and English. Thank you.
// This program was originally tested under VC++ 6.0.

#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main()
{
    string text_name; // a std::string avoids overflowing a fixed 20-character buffer on long file names

    cout << "Please enter the file name (including extension) of the text document to be read\n"; // Enter the text file you want to read, such as 1.txt.
    cin >> text_name;
    ifstream in(text_name.c_str());
    if (!in)
    {
        cerr << "Cannot open " << text_name << endl;
        return 1;
    }

    cout << "Please specify the file name (including extension) for saving the Chinese characters\n"; // Enter the name of the text file that will hold the Chinese characters, such as 2.txt.
    cin >> text_name;
    ofstream out(text_name.c_str());

    char c;
    while (in.get(c)) // Read the file one byte at a time. The loop stops at end of file, so the last character is never written twice.
    {
        // I tested the byte values of some commonly used Chinese characters: every byte is large (above 127) once converted to an unsigned integer,
        // while the characters we know, including the newline you mentioned, are between 0 and 127.
        // So this test drops the non-Chinese characters. get() is used instead of >> so that whitespace is read and tested too.
        if (static_cast<unsigned char>(c) > 127)
            out << c;
    }

    // The following lines read the Chinese characters back from the file you saved and print them to the console, to check that the program behaves reasonably.
    in.close();
    out.close();
    ifstream in1(text_name.c_str());
    for (string s; getline(in1, s); )
        cout << s << endl;

    return 0;
}

// I searched Baidu just now and found no effective way to distinguish Chinese characters from other characters, so I came up with this bold method, but it has many limitations,

// because the program cannot guarantee that some special characters (such as the ellipsis) will not be saved to the file by mistake.
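
One possible way to reduce the special-character problem, sketched under the assumption that the input file is GB2312-encoded (the answer above does not say which encoding it used). In GB2312 the Chinese characters proper occupy rows 16 to 87, which means a lead byte of 0xB0 to 0xF7 followed by a trail byte of 0xA1 to 0xFE, while punctuation and symbols such as the ellipsis use lead bytes 0xA1 to 0xA9. The function name `keep_gb2312_hanzi` is hypothetical and not part of the original program.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (assumes GB2312 input): keep only two-byte sequences
// whose lead byte is 0xB0..0xF7 and trail byte is 0xA1..0xFE, i.e. the
// hanzi rows. ASCII bytes and GB2312 symbols/punctuation are dropped.
std::string keep_gb2312_hanzi(const std::string& input)
{
    std::string result;
    std::size_t i = 0;
    while (i < input.size())
    {
        unsigned char lead = static_cast<unsigned char>(input[i]);
        if (lead < 0x80)
        {
            ++i;            // ASCII is a single byte: skip it
            continue;
        }
        if (i + 1 >= input.size())
            break;          // lone high byte at the end of the data: stop
        unsigned char trail = static_cast<unsigned char>(input[i + 1]);
        if (lead >= 0xB0 && lead <= 0xF7 && trail >= 0xA1 && trail <= 0xFE)
        {
            result += input[i];
            result += input[i + 1];
        }
        i += 2;             // a high lead byte always starts a two-byte pair
    }
    return result;
}
```

To use it with the program above, the whole file could be read into a string, passed through this function, and the result written to the output file. Characters outside GB2312 (for example those only present in GBK) are dropped by this sketch.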

//Finally, I hope my program is of some use to you. If there is any better way, please share it with us. ...
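
If the file is UTF-8 rather than a double-byte encoding (again an assumption, since the original does not state it), a similar filter can work on code points instead of raw bytes. The sketch below keeps only the CJK Unified Ideographs block, U+4E00 to U+9FFF, so the ellipsis (U+2026) and CJK punctuation (U+3000 to U+303F) are dropped. The names `utf8_length` and `keep_cjk_ideographs` are hypothetical, and the code does not validate continuation bytes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length in bytes of the UTF-8 sequence that starts
// with `lead`. Returns 1 for ASCII and for invalid lead bytes, so the scan
// always moves forward.
static std::size_t utf8_length(unsigned char lead)
{
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

// Hypothetical helper (assumes UTF-8 input): keep only code points in
// U+4E00..U+9FFF, which are always encoded as three bytes.
std::string keep_cjk_ideographs(const std::string& utf8)
{
    std::string result;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        std::size_t len = utf8_length(static_cast<unsigned char>(utf8[i]));
        if (i + len > utf8.size())
            break;          // truncated sequence at the end of the data
        if (len == 3)
        {
            // Decode the three-byte sequence into a code point.
            unsigned long cp =
                ((static_cast<unsigned char>(utf8[i])     & 0x0FUL) << 12) |
                ((static_cast<unsigned char>(utf8[i + 1]) & 0x3FUL) << 6)  |
                 (static_cast<unsigned char>(utf8[i + 2]) & 0x3FUL);
            if (cp >= 0x4E00 && cp <= 0x9FFF)
                result.append(utf8, i, 3);
        }
        i += len;
    }
    return result;
}
```

This range covers the commonly used characters only; ideographs in the extension blocks would need additional ranges.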