hadoop - HBase sorting efficiency
I have an employee named "Simon" in row 100, and a second employee with the same name, "Simon", in row 4000. Now I want to get all employees named "Simon" from my employees table. The row key is the SSN of each employee.
My question is: if I look up all the employees named "Simon", how efficient is the search in HBase? Since the first "Simon" is in row 100 and the second "Simon" is in row 4000, HBase seems to have to go through the whole table to find employees by name. Does it do a full table scan in this scenario?
If you have to do a full table scan, which you do, it will not scale well. In fact, if you have a very large number of rows, it will be a terrible solution.
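To make the cost concrete, here is a minimal sketch of what looking up by name amounts to when the row key is the SSN. It uses a plain Python dict as a stand-in for the table, not the HBase client API; the SSNs, names, and the function `find_by_name_full_scan` are made up for illustration.

```python
# In-memory stand-in for an HBase table keyed by SSN (sample data is invented).
employees = {
    "111-11-1111": {"name": "Simon"},
    "222-22-2222": {"name": "Alice"},
    "333-33-3333": {"name": "Simon"},
}


def find_by_name_full_scan(table, name):
    """Return (matching row keys, number of rows examined)."""
    matches = []
    examined = 0
    # Rows come back in row-key order, which says nothing about the name,
    # so every row has to be read and checked.
    for row_key in sorted(table):
        examined += 1
        if table[row_key]["name"] == name:
            matches.append(row_key)
    return matches, examined


matches, examined = find_by_name_full_scan(employees, "Simon")
print(matches, examined)
```

The number of rows examined always equals the table size, regardless of how many rows match.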
To solve this problem, most relational database management systems (or "SQL databases") use indexes. Since you are using a "NoSQL database", it will not automatically create an index for you. Let's see how to create an index manually so that specific kinds of queries are answered efficiently.
Suppose that an entity e in a set S has a unique key K(e) and an attribute value V(e). In addition, assume that your entities are stored in an HBase table, E, with one row per entity and K(e) as the row key of each row. An index of S with respect to V is another table that usually comes in one of three forms.
Index Form 1
Suppose that V(e) is also unique to each entity e. Then an index of S with respect to V is a table with one row per entity, where the row key is V(e) and the value stored in the row is K(e). To look up e by V(e), just go to that row.
When to use this approach: think of a table of employees, employee, where each employee has a unique EmployeeID inside the company, K(e). The main employee table can use EmployeeID as its unique row key, and an Employee_SSN_Index table can use the employee's SSN number as V(e) (which is also unique). This provides a fast lookup of employees by SSN number.
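A rough sketch of Form 1, again with Python dicts standing in for the two HBase tables rather than real client calls. The sample rows and the helper names `build_unique_index` and `get_by_ssn` are assumptions made for illustration.

```python
# Main table: EmployeeID (K(e)) -> columns. Sample data is invented.
employee = {
    "E001": {"name": "Simon", "ssn": "111-11-1111"},
    "E002": {"name": "Alice", "ssn": "222-22-2222"},
    "E003": {"name": "Simon", "ssn": "333-33-3333"},
}


def build_unique_index(table, attribute):
    """Build an index table: attribute value V(e) -> entity key K(e)."""
    index = {}
    for key, columns in table.items():
        value = columns[attribute]
        if value in index:
            # Form 1 only works when V(e) is unique per entity.
            raise ValueError(f"duplicate value for {attribute}: {value}")
        index[value] = key
    return index


employee_ssn_index = build_unique_index(employee, "ssn")


def get_by_ssn(ssn):
    """Two point lookups: index row first, then the main table row."""
    key = employee_ssn_index.get(ssn)
    return None if key is None else employee[key]


print(get_by_ssn("333-33-3333"))
```

Note that the index has to be written alongside every insert, update, and delete on the main table; nothing keeps the two in sync automatically.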
Index Form 2
Suppose that V(e) is potentially not unique to each entity e; that is, there can be duplicates. Then an index of S with respect to V is a table with one row per entity, where the row key is V(e) ++ K(e). To look up all entities e with a given V(e), just do a prefix scan over the rows whose keys start with V(e).
When V(e) is not fixed-width, it may be impossible to tell where V(e) ends and K(e) starts. A separator can be placed between V(e) and K(e) in the row key, for example V(e) ++ "|" ++ K(e). In this case, scan with the prefix V(e) ++ "|".
For example, an employee's Department_index table can use the DepartmentID as the employee attribute value V(e).
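The prefix scan in Form 2 can be sketched with a sorted list of row keys and a binary search, which mimics how a sorted store seeks to the first matching key and then reads forward. This is an in-memory illustration, not HBase client code; the department IDs, employee IDs, and function names are invented, and it assumes "|" never occurs inside a DepartmentID.

```python
from bisect import bisect_left

SEP = "|"  # assumed never to appear inside V(e)

# Main table: EmployeeID (K(e)) -> columns. Sample data is invented.
employee = {
    "E001": {"department_id": "D1"},
    "E002": {"department_id": "D10"},
    "E003": {"department_id": "D1"},
    "E004": {"department_id": "D2"},
}


def build_index(table, attribute):
    """Index row keys of the form V(e) ++ "|" ++ K(e), kept sorted."""
    return sorted(f"{cols[attribute]}{SEP}{key}" for key, cols in table.items())


def prefix_scan(sorted_keys, prefix):
    """Seek to the first key >= prefix, then read until the prefix stops matching."""
    start = bisect_left(sorted_keys, prefix)
    result = []
    for row_key in sorted_keys[start:]:
        if not row_key.startswith(prefix):
            break
        result.append(row_key)
    return result


def employees_in_department(sorted_keys, department_id):
    prefix = department_id + SEP
    # Strip the prefix to recover K(e) from each matching row key.
    return [row_key[len(prefix):] for row_key in prefix_scan(sorted_keys, prefix)]


department_index = build_index(employee, "department_id")
print(employees_in_department(department_index, "D1"))
```

Scanning with the bare prefix "D1" would also return the row for department "D10", which is the ambiguity the separator removes. Python compares strings by code point, which for ASCII keys like these matches byte-wise ordering.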
Index Form 3
Suppose that V(e) is possibly not unique to each entity e; that is, there can be duplicates. Then an index of S with respect to V is a table with one group of entities per row, where the row key is V(e) and each K(e) is a column qualifier in a column family F. That is, the entities are grouped by attribute value in the rows. To look up all entities e with a given V(e), go to row V(e) and request all the columns in column family F.
This approach should really be reserved for cases where the number of entities in each row of the index is small.
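As a sketch of Form 3 applied to the name lookup from the question, the index below keeps one row per name and one column qualifier per employee key. Nested dicts stand in for a row's column family; storing an empty cell value under each qualifier is an assumption of this sketch, as are the sample data and the function names.

```python
# Main table: SSN (K(e)) -> columns. Sample data is invented.
employees = {
    "111-11-1111": {"name": "Simon"},
    "222-22-2222": {"name": "Alice"},
    "333-33-3333": {"name": "Simon"},
}


def build_grouped_index(table, attribute):
    """Index: row key V(e) -> {qualifier K(e): empty cell value}."""
    index = {}
    for key, columns in table.items():
        # Each entity becomes one column in the row for its attribute value.
        index.setdefault(columns[attribute], {})[key] = b""
    return index


def keys_for_value(index, value):
    """One row read: fetch every qualifier in the row for this value."""
    return sorted(index.get(value, {}))


employee_name_index = build_grouped_index(employees, "name")
print(keys_for_value(employee_name_index, "Simon"))
```

Because all matching keys live in a single row, a very common value produces a very wide row, which is why this form suits attributes with few entities per value.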