The mechanisms of Web 2.0 encourage Web users to contribute to knowledge construction on specific topics. For example, the online encyclopedia Wikipedia consists of thousands of valuable articles contributed by numerous knowledge providers. E-commerce websites, such as Amazon, allow users to post product reviews, which help subsequent users judge product quality. Knowledge providers usually come from different cultures and have different background knowledge, so the reviews they compose may express different opinions. As a result, the aggregated knowledge may involve perspectives of different orientations. Moreover, popular topics or products receive so many reviews that the aggregated knowledge can become too large for users to comprehend. Hence, users would benefit greatly if the perspectives of the reviews were well organized.
Reviews on the Web are represented by a set of Web documents. Mining the perspectives embedded in a set of documents is a popular text mining problem. Mining methods, such as the k-means clustering algorithm or latent semantic indexing, partition the documents into content-coherent clusters, each of which represents a perspective of the documents. However, in Web 2.0, reviews left by knowledge providers may oppose one another because of cultural differences. Whereas previous mining methods lack a mechanism to identify opposition in the text, we provide a method to discover the bipolar orientation. In this work, we use the statistical technique of principal component analysis (PCA) to find the bipolar orientation embedded in the text. PCA constructs the covariance (or correlation coefficient) matrix of the reviews and treats the eigenvectors of the matrix as meaningful perspectives. The bipolar orientation of the perspectives can then be extracted by analyzing the composition of the eigenvectors. The results can be very useful for helping users comprehend aggregated online knowledge.
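To make the idea concrete, the following is a minimal sketch of how the signs of an eigenvector can separate reviews into two opposing groups. It is an illustration under stated assumptions, not the exact procedure of this work: the toy term-count matrix, the function name `bipolar_split`, and the choice to use only the leading eigenvector of the review-by-review correlation matrix are all hypothetical.

```python
import numpy as np

# Hypothetical toy data: rows are terms, columns are reviews.
# Reviews 0 and 1 use favorable terms; reviews 2 and 3 use unfavorable ones.
counts = np.array([
    [2, 1, 0, 0],  # "great"
    [1, 2, 0, 0],  # "love"
    [1, 1, 0, 0],  # "excellent"
    [0, 0, 2, 1],  # "poor"
    [0, 0, 1, 2],  # "broken"
    [0, 0, 1, 1],  # "refund"
], dtype=float)


def bipolar_split(term_review_counts):
    """Split reviews into two groups by the sign of their loadings on the
    leading eigenvector of the review-by-review correlation matrix.

    Assumes every review column has non-zero variance; otherwise the
    correlation coefficients are undefined.
    """
    # Correlation between reviews (columns are the variables).
    corr = np.corrcoef(term_review_counts, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    leading = eigenvectors[:, -1]
    # The overall sign of an eigenvector is arbitrary, so only the grouping
    # is meaningful, not which group is labeled "positive".
    positive = np.where(leading >= 0)[0].tolist()
    negative = np.where(leading < 0)[0].tolist()
    return positive, negative


group_a, group_b = bipolar_split(counts)
print("Group A reviews:", group_a)
print("Group B reviews:", group_b)
```

In this toy example the reviews that share vocabulary are positively correlated and the two camps are negatively correlated, so the loadings on the leading eigenvector take opposite signs for the two camps. On real review collections, more than one eigenvector would likely need to be inspected, and term weighting would matter.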