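Software Engineering II: First Coding Assignment

What is KWIC?

KWIC stands for "Keyword in Context". It is a text-analysis method that extracts keywords from a text and displays a certain amount of surrounding context for each one. It is commonly used for text data in indexing and search systems so that users can quickly find the text related to a keyword of interest. A KWIC index is usually displayed as a table or grid in which each row holds a keyword and the text fragment around it, letting users skim the content and locate the information they want.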

There also seem to be some related websites:

Introduction to corpus websites: KWiC Finder (bilibili)

KWiCFinder Web Concordancer & Online Research Tool

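Some reading material

A classic software architecture problem: analysis and solution of KWIC - youxin - cnblogs (cnblogs.com)

With code: implementing KWIC with the main program/subroutine, object-oriented, event system, and pipe-and-filter architectures - Himit_ZH's blog, CSDN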

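What needs to be done?

**Problem statement:** KWIC (Key Word In Context), Parnas (1972)

The KWIC index system accepts a set of lines. Each line consists of several words, and each word consists of several characters. Any line can be circularly shifted, that is, its first word is repeatedly removed and appended to the end of the line. KWIC outputs all circular shifts of all lines in alphabetical order.

An example follows:

The original lines are output as well, so there are 3 + 2 + 3 = 8 lines in total.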
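The shifting and sorting described above can be sketched in a few lines of Java. This is only a minimal illustration and not the assignment's required structure: the class name `KwicShiftDemo`, the method names, and the three input lines (chosen to have 3, 2 and 3 words) are all made up here, and sorting case-insensitively is an assumption about what "alphabetical order" means.

```java
import java.util.ArrayList;
import java.util.List;

class KwicShiftDemo {

    // Returns every circular shift of one line, the unshifted line included.
    static List<String> circularShifts(String line) {
        String[] words = line.trim().split("\\s+");
        List<String> shifts = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < words.length; k++) {
                if (k > 0) {
                    sb.append(' ');
                }
                // Wrap around so the words before 'start' end up at the back.
                sb.append(words[(start + k) % words.length]);
            }
            shifts.add(sb.toString());
        }
        return shifts;
    }

    // Collects the shifts of all lines and sorts them (assumption: ignoring case).
    static List<String> kwic(List<String> lines) {
        List<String> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(circularShifts(line));
        }
        all.sort(String.CASE_INSENSITIVE_ORDER);
        return all;
    }

    public static void main(String[] args) {
        // Hypothetical input: three lines with 3, 2 and 3 words, giving 8 shifts.
        List<String> lines = List.of(
                "Pipes and Filters", "Software Architecture", "Keyword in Context");
        kwic(lines).forEach(System.out::println);
    }
}
```

A line with n words yields n shifts, which is where the 3 + 2 + 3 count comes from.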

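How to implement it?

Implementation with MS (main program and subroutines)

The KWIC class:

Working directly with these four member variables is rather painful. Who still uses char arrays these days?

I gave up on that approach. Following the requirements above, a single input function already takes this many lines, which is hard to stomach.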

/**
 * Input function reads the raw data from the specified file and stores it in the core storage.
 * If some system I/O error occurs the program exits with an error message.
 * The format of raw data is as follows. Lines are separated by the line separator
 * character(s) (on Unix '\n', on Windows '\r\n'). Each line consists of a number of
 * words. Words are delimited by any number and combination of the space character (' ')
 * and the horizontal tabulation character ('\t'). The entered data is parsed in the
 * following way. All line separators are removed from the data, all horizontal tabulation
 * word delimiters are replaced by a single space character, and all multiple word
 * delimiters are replaced by a single space character. Then the parsed data is represented
 * in the core as two arrays: chars_ array and line_index_ array.
 *
 * @param file Name of input file
 */

public void input(String file) {

    // Needs java.io.BufferedReader, java.io.CharArrayWriter, java.io.FileReader,
    // java.io.FileNotFoundException, java.io.IOException and java.util.ArrayList.
    //
    // Target format after parsing:
    // All line separators are removed from the data, all horizontal tabulation
    // word delimiters are replaced by a single space character, and all multiple word
    // delimiters are replaced by a single space character.
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {

        CharArrayWriter caw = new CharArrayWriter(); // collects the contents of chars_
        ArrayList<Integer> lineStarts = new ArrayList<>(); // offset of each line in chars_
        String line;

        // readLine() already strips "\n" and "\r\n", so the separators never reach chars_.
        while ((line = reader.readLine()) != null) {
            // Record where this line begins before its characters are appended.
            lineStarts.add(caw.size());
            // Replace any run of spaces and tabs with a single space.
            caw.write(line.replaceAll("[ \t]+", " "));
        }

        chars_ = caw.toCharArray();

        // At first I did not understand why all line separators are removed from the data.
        // Assumption: line_index_ holds the offset in chars_ at which each line starts,
        // which makes the separators unnecessary.
        line_index_ = new int[lineStarts.size()];
        for (int i = 0; i < line_index_.length; i++) {
            line_index_[i] = lineStarts.get(i);
        }
    } catch (FileNotFoundException e) {
        System.out.println("File not found!");
        System.exit(1);
    } catch (IOException e) {
        System.out.println("IO error!");
        System.exit(1);
    }
}
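The Javadoc above says the parsed data lives in two arrays but does not spell out what `line_index_` contains. One plausible reading, and it is only an assumption here, is that it stores the offset in `chars_` at which each line starts. Under that reading the line separators carry no extra information, because every line can be recovered from the two arrays alone. The sketch below illustrates this with a hypothetical helper class `CoreStorageDemo` and made-up data.

```java
import java.util.ArrayList;
import java.util.List;

class CoreStorageDemo {

    // Rebuilds each line from the two arrays.
    // Assumption: lineIndex[i] is the offset in chars at which line i starts.
    static List<String> lines(char[] chars, int[] lineIndex) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < lineIndex.length; i++) {
            int start = lineIndex[i];
            // A line ends where the next one starts, or at the end of chars.
            int end = (i + 1 < lineIndex.length) ? lineIndex[i + 1] : chars.length;
            result.add(new String(chars, start, end - start));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical parsed data for the two lines "a b" and "c d e".
        char[] chars = "a bc d e".toCharArray();
        int[] lineIndex = {0, 3};
        System.out.println(lines(chars, lineIndex));
    }
}
```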

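Implementation with OO

The Alphabetizer class

Used for sorting.

To make the sorting algorithm easy to change,

create an inner class inside this class that sorts line by line.

If all you need is a custom ordering between objects of some class, using Comparator directly is enough.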

Sorting an ArrayList with the Comparator interface - 惟念依's blog, CSDN
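As a rough sketch of the inner-class idea, the ordering can live in a small nested `Comparator` so that changing the sort only means changing that class. The names `AlphabetizerSketch`, `LineComparator` and `sorted` are invented for this illustration and are not the assignment's actual Alphabetizer API. The case-insensitive comparison is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class AlphabetizerSketch {

    // Nested class that holds the ordering; swap it out to change how lines are sorted.
    static class LineComparator implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            // Assumption: alphabetical order means case-insensitive order.
            return a.compareToIgnoreCase(b);
        }
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sorted(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        copy.sort(new LineComparator());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sorted(List.of("b c", "A d", "a b")));
    }
}
```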

But if, as in this problem, you also need the indices after sorting, that alone is not enough, and the Comparable interface is needed.
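One way to keep the indices is to pair each line with its original position in an element type that implements `Comparable`, sort those elements, and then read the positions back. The sketch below is a guess at that approach and not the code actually submitted. `IndexedSortSketch`, `IndexedLine` and `sortedIndices` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class IndexedSortSketch {

    // A line paired with its position in the original list.
    static class IndexedLine implements Comparable<IndexedLine> {
        final String text;
        final int index;

        IndexedLine(String text, int index) {
            this.text = text;
            this.index = index;
        }

        @Override
        public int compareTo(IndexedLine other) {
            // Assumption: lines are ordered case-insensitively by their text.
            return text.compareToIgnoreCase(other.text);
        }
    }

    // Returns the original indices of the lines, listed in sorted order.
    static int[] sortedIndices(List<String> lines) {
        List<IndexedLine> items = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            items.add(new IndexedLine(lines.get(i), i));
        }
        Collections.sort(items); // uses IndexedLine.compareTo
        int[] order = new int[items.size()];
        for (int i = 0; i < order.length; i++) {
            order[i] = items.get(i).index;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] order = sortedIndices(List.of("b c", "A d", "a b"));
        System.out.println(java.util.Arrays.toString(order));
    }
}
```

Sorting a list of `Integer` indices with a `Comparator` that looks up the corresponding lines would give the same result, so the choice between the two interfaces is mostly a matter of where the ordering should live.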

ChrisDing's bblog