2025-04-06 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
1. Oracle Text index types
Each index type is described below by its purpose, supported preferences and parameters, query operator, and notes.

CONTEXT
Description: Use this index to build a text retrieval application when your text consists of large, coherent documents (for example, MS Word, HTML, or plain text). The index can be customized in several ways. This index type requires CTX_DDL.SYNC_INDEX after inserts, updates, and deletes to the base table.
Supported preferences and parameters: All CREATE INDEX preferences and parameters are supported, except for INDEX SET. Supported parameters include the index partition clause and the format, character set, and language columns.
Query operator: CONTAINS. The CONTEXT grammar supports a rich set of operations. Use the CTXCAT grammar with query templates.
Notes: All document services and query services are supported. Indexing of partitioned text tables is supported. The FILTER BY and ORDER BY clauses of CREATE INDEX are supported, so that structured column values are indexed as well and mixed queries are processed more efficiently.

CTXCAT
Description: Use this index for better mixed-query performance on small documents and text fragments. To improve mixed-query performance, include other columns of the base table in the index, such as item name, price, and description. This index type is transactional: it updates itself automatically after inserts, updates, or deletes to the base table, so CTX_DDL.SYNC_INDEX is not required.
Supported preferences and parameters: INDEX SET, LEXER, STOPLIST, STORAGE, and WORDLIST (the prefix_index attribute is supported only for Japanese data). Not supported: the format, character set, and language columns, and table and index partitioning.
Query operator: CATSEARCH. The CTXCAT grammar supports logical operations, phrase queries, and wildcards. Use the CONTEXT grammar with query templates. Theme queries are supported.
Notes: This index is larger and takes longer to build than a CONTEXT index. The size of a CTXCAT index depends on the total amount of text to be indexed, the number of indexes in the index set, and the number of columns indexed. Consider your queries and resources carefully before adding indexes to the index set. A CTXCAT index does not support index partitioning, document services (highlighting, markup, themes, and gists), or query services (explain, query feedback, and browse words).

CTXRULE
Description: Use this index to build a document classification or routing application. Create the index on a table of queries, where the queries define the classification or routing criteria.
Query operator: MATCHES.
Notes: Use the MATCHES operator to classify individual documents (plain text, HTML, or XML). MATCHES turns a document into a set of queries and finds the matching rows in the index.
To build a document classification application using simple or rule-based classification, create an index of type CTXRULE. The index classifies plain text, HTML, or XML documents by means of the MATCHES operator. Store the defining query set in the text table that you index.

An Oracle Text index is an Oracle Database domain index. To build a query application, you can create an index of type CONTEXT on a mix of text and structured data columns and query it with the CONTAINS operator.

You create an index from a populated text table. In a query application, the table must contain the text or pointers to the location of the stored text. Text is usually a collection of documents, but it can also be small text fragments.

Note: if you are building a new application that uses XML data, Oracle recommends that you use XMLIndex instead of CTXRULE.

You create an Oracle Text index as a type of extensible index to Oracle Database by using standard SQL. This means that an Oracle Text index operates like an ordinary Oracle Database index: it has a name by which it is referenced, and it can be manipulated with standard SQL statements.

The benefit of creating an Oracle Text index is fast response time for text queries that use the CONTAINS, CATSEARCH, and MATCHES operators. These operators query the CONTEXT, CTXCAT, and CTXRULE index types, respectively.

Note: columns with transparent data encryption enabled do not support domain indexes, so do not use them with Oracle Text. However, you can create an Oracle Text index on a column of a table stored in a tablespace that has transparent data encryption enabled.
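As a quick reference for the pairing of index types and operators described above, the following Python sketch builds the basic CREATE INDEX statement for each type. The index, table, and column names are hypothetical, and the helper only formats a string for illustration; it does not connect to a database or validate identifiers.

```python
# Operator that queries each Oracle Text index type, as stated in the text.
OPERATOR_FOR_INDEX_TYPE = {
    "CONTEXT": "CONTAINS",
    "CTXCAT": "CATSEARCH",
    "CTXRULE": "MATCHES",
}


def create_index_sql(index_name, table, column, index_type):
    """Return a basic CREATE INDEX statement for the given Oracle Text index type.

    index_name, table and column are placeholders supplied by the caller.
    """
    if index_type not in OPERATOR_FOR_INDEX_TYPE:
        raise ValueError(f"unknown Oracle Text index type: {index_type}")
    return (
        f"CREATE INDEX {index_name} ON {table}({column}) "
        f"INDEXTYPE IS CTXSYS.{index_type}"
    )


# Hypothetical names, for illustration only.
for kind in OPERATOR_FOR_INDEX_TYPE:
    print(create_index_sql("docs_idx", "docs", "text", kind),
          "-- queried with", OPERATOR_FOR_INDEX_TYPE[kind])
```

Preferences such as LEXER or STOPLIST would normally be passed in a PARAMETERS clause, which this sketch leaves out.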
2. Structure of the Oracle Text CONTEXT index
Oracle Text indexes text by converting all words into tokens. The general structure of an Oracle Text CONTEXT index is an inverted index, in which each token is associated with a list of the documents (rows) that contain it.
For example, after a single initial indexing operation, the word DOG might have the following entry:
DOG  DOC1 DOC3 DOC5
This means that the word DOG is contained in the rows that store documents one, three, and five.

Merged word and theme indexing
By default, for English and French, Oracle Text indexes theme information together with word information. You can query theme information with the ABOUT operator. You can also enable and disable theme indexing.
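The inverted index structure described above can be pictured with a small Python sketch. This is a toy model under simple assumptions (tokens are whitespace-separated words folded to uppercase, and the sample documents are made up); it is not how Oracle Text stores its index internally.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each uppercase token to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.upper().split():
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


# Made-up sample rows: document id -> text.
docs = {
    1: "the dog barks",
    2: "a cat sleeps",
    3: "dog and cat",
    4: "birds sing",
    5: "my dog",
}
index = build_inverted_index(docs)
print(index["DOG"])  # documents 1, 3 and 5 contain the token DOG
```

A query for a word then becomes a lookup of its token, rather than a scan of every document.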
3. The Oracle Text indexing process
You start the indexing process by using the CREATE INDEX statement to create an Oracle Text index of tokens, organized according to your parameters and preferences.
The indexing process is shown in figure 3-1. The process is a data stream that is acted upon by the different indexing objects. Each object corresponds to an indexing preference type or section group that you can specify in the parameter string of CREATE INDEX or ALTER INDEX.
3.1 Datastore object
The stream starts with the datastore, which reads in the documents as they are stored in the system according to your datastore preference. For example, if the datastore is defined as FILE_DATASTORE, the stream starts by reading the files from the operating system. You can also store documents on the internet or in an Oracle database. Wherever the files physically reside, a text table in the Oracle database must always point to them.

3.2 Filter object
The stream then passes through the filter. Your FILTER preference determines what happens, and the stream is handled in one of the following ways. When you specify the NULL_FILTER preference type, or when the value of the format column is IGNORE, no filtering takes place; plain text, HTML, and XML documents need no filtering. When you specify the AUTO_FILTER preference type, or when the value of the format column is BINARY, formatted (binary) documents are filtered to marked-up text.

3.3 Sectioner object
After filtering, the marked-up text passes through the sectioner, which separates the stream into text and section information. Section information includes where sections begin and end in the text stream. The type of sections extracted is determined by the section group type. The text is passed to the lexer. The section information is passed directly to the indexing engine, which uses it later.

3.4 Lexer object
You create a lexer preference by using one of the Oracle Text lexer types to specify the language of the text to be indexed. The lexer breaks the text into tokens according to your language. These tokens are usually words. To extract tokens, the lexer uses the parameters defined in your lexer preference. These parameters include the definition of the characters that separate tokens, such as whitespace, and whether to convert the text to all uppercase or leave it in mixed case.
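The lexer (lexical analyzer) behaviour described above, splitting text on separator characters and optionally folding case, can be sketched in Python as follows. The function name, its parameters, and the default separator set are assumptions made for this illustration and do not correspond to actual Oracle Text lexer attributes.

```python
import re


def tokenize(text, separators=" \t\n.,;:!?", mixed_case=False):
    """Split text into tokens on any run of separator characters.

    Tokens are converted to uppercase unless mixed_case is True.
    """
    pattern = "[" + re.escape(separators) + "]+"
    tokens = [t for t in re.split(pattern, text) if t]
    return tokens if mixed_case else [t.upper() for t in tokens]


print(tokenize("The quick dog, the lazy Dog."))
print(tokenize("The quick dog, the lazy Dog.", mixed_case=True))
```

With the default settings, "dog" and "Dog" produce the same token, so a search for either form finds both. With mixed_case=True they remain distinct tokens.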
When theme indexing is enabled, the lexer also analyzes your text to create theme tokens for indexing.

3.5 Indexing engine
The indexing engine creates the inverted index that maps tokens to the documents that contain them. In this phase, Oracle Text uses the stoplist that you specify to exclude stopwords or stopthemes from the index. Oracle Text also uses the parameters defined in your WORDLIST preference. These parameters tell the system how to create a prefix index or substring index, if enabled.

4. Updates to indexed columns
In releases before Oracle Database 12c Release 2 (12.2), when the column on which an Oracle Text index is based is updated, the document is unavailable for search operations until the index is synchronized, and user queries cannot find it. Starting with Oracle Database 12c Release 2 (12.2), you can specify that documents must remain searchable after updates, without immediately performing index synchronization. Before the index is synchronized, queries use the old index entries to fetch the contents of the old document. After the index is synchronized, user queries fetch the contents of the updated document. The ASYNCHRONOUS_UPDATE option for indexes retains the old contents of a document after an update, so that the index can continue to answer user queries.

5. Partitioned tables and indexes
When you create a partitioned CONTEXT index on a partitioned text table, you must partition the table by range. Hash, composite, and list partitioning are not supported. You can create a partitioned text table to partition your data by date. For example, if your application maintains a large library of dated news articles, you can partition the information by month or year. Partitioning simplifies the manageability of large databases, because queries, inserts, updates, deletes, and backup and recovery operations can act on a single partition. On local CONTEXT indexes with multiple table sets, Oracle Text supports the number of partitions supported by Oracle Database.
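Range partitioning by date, as described above, can be pictured as routing each row to the first partition whose upper bound is greater than the row's date. The Python sketch below is only a model of that rule; the partition names and bounds are hypothetical, and the database performs this routing itself.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical monthly partitions. Each bound is exclusive: a partition
# holds rows whose date is strictly less than its bound.
BOUNDS = [date(2024, 2, 1), date(2024, 3, 1), date(2024, 4, 1)]
NAMES = ["p_2024_01", "p_2024_02", "p_2024_03"]


def partition_for(published):
    """Return the name of the range partition that holds the given date."""
    i = bisect_right(BOUNDS, published)
    if i == len(BOUNDS):
        raise ValueError("no partition covers this date")
    return NAMES[i]


print(partition_for(date(2024, 1, 15)))
print(partition_for(date(2024, 2, 1)))
```

Because each article belongs to exactly one partition, maintenance on one month's articles leaves the other partitions untouched.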
To query a partitioned table, use CONTAINS in the WHERE clause of a SELECT statement, just as you would for a regular table. You can query the entire table or a single partition. However, if you use the ORDER BY SCORE clause, Oracle recommends that you query a single partition, unless you include a range predicate that limits the query to a single partition.

6. Online indexes
When it is not practical to lock the base table for indexing because of ongoing updates, you can create the index online with the ONLINE parameter of the CREATE INDEX statement. This way, applications with frequent inserts, updates, or deletes do not have to stop updating the base table while the index is built. The base table is still locked for short periods at the beginning and end of the indexing process.

7. Parallel indexing
Oracle Text supports parallel indexing with the CREATE INDEX statement. When you issue a parallel indexing statement on a nonpartitioned table, Oracle Text splits the base table into temporary partitions, spawns child processes, and assigns each child to a partition. Each child then indexes the rows in its partition. The method of slicing the base table into partitions is determined by Oracle and is not under your direct control. The same is true of the number of child processes actually spawned, which depends on machine capabilities, system load, init.ora settings, and other factors. Because of these variables, the actual degree of parallelism may not match the degree requested. Because indexing is an I/O-intensive operation, parallel indexing is most effective at reducing indexing time when you have distributed disk access and multiple CPUs. Parallel indexing affects only the performance of the initial index build with the CREATE INDEX statement. It does not affect insert, update, and delete operations with ALTER INDEX, and it has minimal effect on query performance.
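The split, index, and merge pattern described above can be sketched in Python. This is a conceptual model only: the round-robin slicing and the use of threads are assumptions made for the illustration, whereas Oracle decides the slicing and the number of child processes itself.

```python
from concurrent.futures import ThreadPoolExecutor


def index_slice(rows):
    """Build a partial inverted index (token -> doc ids) for one slice of rows."""
    partial = {}
    for doc_id, text in rows:
        for token in set(text.upper().split()):
            partial.setdefault(token, []).append(doc_id)
    return partial


def parallel_index(rows, degree):
    """Split rows into `degree` slices, index each in a worker, and merge the results."""
    slices = [rows[i::degree] for i in range(degree)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = list(pool.map(index_slice, slices))
    merged = {}
    for partial in partials:
        for token, ids in partial.items():
            merged.setdefault(token, []).extend(ids)
    return {token: sorted(ids) for token, ids in merged.items()}


# Made-up sample rows: (document id, text).
rows = [(1, "dog"), (2, "cat"), (3, "dog cat"), (4, "bird"), (5, "dog")]
print(parallel_index(rows, 2)["DOG"])
```

The merged index is the same whatever degree is used. Only the time taken to build it changes, which matches the point that parallelism affects the initial build rather than later queries.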
Because parallel indexing reduces the initial indexing time, it is useful for data staging when your product includes an Oracle Text index, for rapid initial startup of applications based on large data collections, and for application testing, when you need to try different index parameters and schemas while developing your application.

Indexes and views
Oracle SQL standards do not support creating indexes on views. If you need to index documents whose contents are in different tables, create a datastore preference by using the USER_DATASTORE object. With this object, you can define a procedure that composes documents from different tables at index time.
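The idea of composing one indexable document from several tables can be sketched in Python. This is an analogy only: with USER_DATASTORE the composing procedure runs inside the database, and the table layouts, field names, and function name below are hypothetical.

```python
def compose_document(article_id, articles, comments):
    """Concatenate an article's title, body and comments into one text to index."""
    article = articles[article_id]
    parts = [article["title"], article["body"]]
    parts.extend(c["text"] for c in comments if c["article_id"] == article_id)
    return "\n".join(parts)


# Made-up rows standing in for two separate tables.
articles = {7: {"title": "Oracle Text", "body": "Indexing basics"}}
comments = [
    {"article_id": 7, "text": "Nice overview"},
    {"article_id": 8, "text": "Unrelated"},
]
print(compose_document(7, articles, comments))
```

The index is then built over the composed text, so a single text query can match words that originate in either table.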