Author: alien
DynamoDB – Indexes
DynamoDB indexes the primary key attributes of a table to speed up access. Indexes accelerate data retrieval and improve performance by reducing application lag.
Secondary Index
A secondary index holds a subset of a table's attributes together with an alternate key. You access it through a query or scan operation that targets the index.
Its contents are the attributes you project (copy) into it. When creating the index, you define its alternate key and any attributes you wish to project. DynamoDB then copies those attributes into the index, along with the primary key attributes sourced from the table. Afterwards, you query or scan the index much as you would a table.
DynamoDB automatically maintains all secondary indexes. On item operations, such as adding, updating, or deleting, it updates any indexes on the target table.
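The behavior described above can be pictured with a small in-memory model. This is only a sketch, not how DynamoDB is implemented: the class `IndexSketch` and the attribute names used with it (`Id`, `Category`, `Price`, `Description`) are hypothetical. It shows an index keyed by an alternate attribute that stores a projected copy of each item, always carries the table key, and is refreshed on every put and delete.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of a table plus one secondary index; illustrative only. */
final class IndexSketch {
    private final String tableKey;       // primary key attribute of the table
    private final String indexKey;       // alternate key attribute of the index
    private final Set<String> projected; // extra attributes copied into the index

    private final Map<String, Map<String, Object>> table = new HashMap<>();
    // index key value -> (table key value -> projected copy of the item)
    private final Map<String, Map<String, Map<String, Object>>> index = new TreeMap<>();

    IndexSketch(String tableKey, String indexKey, Set<String> projected) {
        this.tableKey = tableKey;
        this.indexKey = indexKey;
        this.projected = projected;
    }

    /** Writes an item to the table and refreshes its index entry. */
    void put(Map<String, Object> item) {
        String pk = (String) item.get(tableKey);
        delete(pk); // drop any stale index entry left by an overwritten item
        table.put(pk, new HashMap<>(item));
        String altKey = (String) item.get(indexKey);
        if (altKey == null) {
            return; // an item without the index key attribute is not indexed
        }
        Map<String, Object> entry = new HashMap<>();
        entry.put(tableKey, pk); // table key attributes are always copied
        entry.put(indexKey, altKey);
        for (String attr : projected) {
            if (item.containsKey(attr)) {
                entry.put(attr, item.get(attr));
            }
        }
        index.computeIfAbsent(altKey, k -> new TreeMap<>()).put(pk, entry);
    }

    /** Removes an item from the table and from the index. */
    void delete(String pk) {
        Map<String, Object> old = table.remove(pk);
        if (old == null || old.get(indexKey) == null) {
            return;
        }
        String altKey = (String) old.get(indexKey);
        Map<String, Map<String, Object>> bucket = index.get(altKey);
        if (bucket != null) {
            bucket.remove(pk);
            if (bucket.isEmpty()) {
                index.remove(altKey);
            }
        }
    }

    /** Returns the projected copies stored under one alternate key value. */
    List<Map<String, Object>> queryIndex(String altKeyValue) {
        Map<String, Map<String, Object>> bucket = index.get(altKeyValue);
        return bucket == null ? new ArrayList<>() : new ArrayList<>(bucket.values());
    }

    public static void main(String[] args) {
        IndexSketch sketch = new IndexSketch("Id", "Category", Set.of("Price"));
        sketch.put(Map.of("Id", "p1", "Category", "Books", "Price", 12, "Description", "not projected"));
        sketch.put(Map.of("Id", "p2", "Category", "Books", "Price", 30));
        System.out.println(sketch.queryIndex("Books"));
    }
}
```

Querying the index returns only the projected attributes plus the keys, so `Description` is absent from the index entries even though the table item still holds it.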
DynamoDB offers two types of secondary indexes −
- Global Secondary Index − This index includes a partition key and sort key, which may differ from those of the source table. It carries the label “global” because queries and scans on the index can span all table data, across all partitions.
- Local Secondary Index − This index shares a partition key with the table, but uses a different sort key. It is “local” because every partition of the index is scoped to the table partition with the identical partition key value.
The best type of index to use depends on application needs. Consider the differences between the two presented in the following table −
| Quality | Global Secondary Index | Local Secondary Index |
| --- | --- | --- |
| Key Schema | It uses a simple or composite primary key. | It always uses a composite primary key. |
| Key Attributes | The index partition key and sort key can consist of string, number, or binary table attributes. | The partition key of the index is an attribute shared with the table partition key. The sort key can be string, number, or binary table attributes. |
| Size Limits Per Partition Key Value | They carry no size limitations. | It imposes a 10GB maximum limit on total size of indexed items associated with a partition key value. |
| Online Index Operations | You can spawn them at table creation, add them to existing tables, or delete existing ones. | You must create them at table creation, but cannot delete them or add them to existing tables. |
| Queries | It allows queries covering the entire table, and every partition. | They address single partitions through the partition key value provided in the query. |
| Consistency | Queries of these indexes only offer the eventually consistent option. | Queries of these offer the options of eventually consistent or strongly consistent. |
| Throughput Cost | It includes throughput settings for reads and writes. Queries/scans consume capacity from the index, not the table, which also applies to table write updates. | Queries/scans consume table read capacity. Table writes update local indexes, and consume table capacity units. |
| Projection | Queries/scans can only request attributes projected into the index, with no retrievals of table attributes. | Queries/scans can request those attributes not projected; furthermore, automatic fetches of them occur. |

When creating multiple tables with secondary indexes, do it sequentially; that is, make a table and wait for it to reach the ACTIVE state before creating another and again waiting. DynamoDB does not permit concurrent creation.
Each secondary index requires certain specifications −
- Type − Specify local or global.
- Name − It uses naming rules identical to those for tables.
- Key Schema − Only top-level attributes of type string, number, or binary are permitted, with the index type determining other requirements.
- Attributes for Projection − DynamoDB automatically projects the key attributes; you may project additional attributes of any data type.
- Throughput − Specify read/write capacity for global secondary indexes.
By default, a table can have up to 5 local secondary indexes and 20 global secondary indexes (older documentation lists a limit of 5 global).
You can access detailed information about indexes with DescribeTable. It returns the name, size, and item count of each index.
Note − These values update approximately every 6 hours.
In queries or scans used to access index data, provide the table and index names, the desired attributes for the result, and any conditional statements. For queries, DynamoDB offers the option to return results in either ascending or descending sort key order.
Note − The deletion of a table also deletes all indexes.
DynamoDB – Scan
Scan operations read every item in a table or secondary index. By default, a scan returns all data attributes of every item in the index or table. Use the ProjectionExpression parameter to return only selected attributes.
Every scan returns a result set; when nothing matches, the set is empty. A single scan request retrieves no more than 1MB of data, with the option to filter it.
Note − The parameters and filtering of scans also apply to querying.
Types of Scan Operations
Filtering − Scan operations offer fine filtering through filter expressions, which are applied after the scan or query reads the data but before results are returned. The expressions use comparison operators, and their syntax resembles that of condition expressions. In a query, a filter expression cannot reference the partition key or sort key; those belong in the key condition expression. A scan has no such restriction.
Note − The 1MB limit applies prior to any application of filtering.
Throughput Specifications − Scans consume throughput, but consumption depends on the size of the items read rather than on the data returned. The consumption remains the same whether you request every attribute or only a few, and using a filter expression does not change it either.
Pagination − DynamoDB paginates results, dividing them into pages. The 1MB limit applies to each page, and when the data exceeds it, another scan request becomes necessary to gather the rest. The LastEvaluatedKey value allows you to perform this subsequent scan: simply pass the value as the ExclusiveStartKey of the next request. When LastEvaluatedKey is null, the operation has completed all pages of data. A non-null value, however, does not guarantee that more data remains; only a null value signals that the scan is finished.
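The pagination loop can be sketched with an in-memory stand-in for the Scan call. This is a simulation under stated assumptions, not the AWS SDK: `PaginationSketch`, `scanPage`, and `scanAll` are hypothetical names, items are modeled as integer keys of a fixed size, and a page is considered to have hit the cap when the next item would no longer fit within 1MB.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a paginated Scan; not the AWS SDK. */
final class PaginationSketch {
    static final int PAGE_LIMIT_BYTES = 1024 * 1024; // the 1MB cap per page

    /** One page of results plus the key to resume from (null when finished). */
    static final class Page {
        final List<Integer> itemKeys;
        final Integer lastEvaluatedKey;

        Page(List<Integer> itemKeys, Integer lastEvaluatedKey) {
            this.itemKeys = itemKeys;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /**
     * Reads items with keys 0..itemCount-1, each itemSizeBytes large, starting
     * after exclusiveStartKey, and stops when the next item would exceed the cap.
     */
    static Page scanPage(int itemCount, int itemSizeBytes, Integer exclusiveStartKey) {
        if (itemSizeBytes <= 0 || itemSizeBytes > PAGE_LIMIT_BYTES) {
            throw new IllegalArgumentException("item size must be between 1 byte and 1MB");
        }
        List<Integer> keys = new ArrayList<>();
        int next = exclusiveStartKey == null ? 0 : exclusiveStartKey + 1;
        int bytes = 0;
        while (next < itemCount && bytes + itemSizeBytes <= PAGE_LIMIT_BYTES) {
            keys.add(next);
            bytes += itemSizeBytes;
            next++;
        }
        // A key is returned whenever the page stopped at the cap, even if that
        // happened on the very last item; the following page is then empty.
        boolean hitCap = bytes + itemSizeBytes > PAGE_LIMIT_BYTES;
        Integer last = hitCap ? Integer.valueOf(next - 1) : null;
        return new Page(keys, last);
    }

    /** Follows LastEvaluatedKey until it comes back null. */
    static List<Integer> scanAll(int itemCount, int itemSizeBytes) {
        List<Integer> all = new ArrayList<>();
        Integer startKey = null;
        do {
            Page page = scanPage(itemCount, itemSizeBytes, startKey);
            all.addAll(page.itemKeys);
            startKey = page.lastEvaluatedKey; // becomes the next ExclusiveStartKey
        } while (startKey != null);
        return all;
    }

    public static void main(String[] args) {
        System.out.println(scanAll(10, 300 * 1024).size() + " items read across pages");
    }
}
```

The loop ends only on a null key. With four 256KB items the first page fills exactly and still returns a key, and the second page comes back empty with a null key, which is why a non-null key alone does not prove more data remains.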
The Limit Parameter − The Limit parameter manages the result size. DynamoDB uses it to cap the number of items it evaluates before returning data. If you set a value of x, DynamoDB evaluates at most x items and returns those among them that match any filter expression, so a filtered request may return fewer than x items.
The LastEvaluatedKey value also applies when the Limit parameter yields partial results. Use it to complete scans.
Result Count − Responses to queries and scans also include ScannedCount and Count: ScannedCount is the number of items evaluated, and Count is the number of items returned after filtering. If you do not filter, their values are identical. When the data exceeds 1MB, the counts represent only the portion processed in that response.
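How Limit, a filter expression, Count, and ScannedCount interact can be shown with a small simulation. It assumes, as the AWS API reference states, that Limit caps the number of items evaluated before the filter runs. The class `LimitAndCountSketch`, its `scan` method, and the sample prices are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Simulates Limit and filtering on a scan over a list of prices. */
final class LimitAndCountSketch {
    /** Mirrors the Count and ScannedCount fields of a scan response. */
    static final class Result {
        final List<Integer> items;
        final int count;        // items returned after filtering
        final int scannedCount; // items evaluated before filtering

        Result(List<Integer> items, int scannedCount) {
            this.items = items;
            this.count = items.size();
            this.scannedCount = scannedCount;
        }
    }

    /** Evaluates at most `limit` items, then keeps those passing the filter. */
    static Result scan(int[] prices, int limit, IntPredicate filter) {
        List<Integer> matched = new ArrayList<>();
        int scanned = 0;
        for (int price : prices) {
            if (scanned == limit) {
                break; // Limit counts evaluated items, not matches
            }
            scanned++;
            if (filter.test(price)) {
                matched.add(price);
            }
        }
        return new Result(matched, scanned);
    }

    public static void main(String[] args) {
        int[] prices = {150, 40, 300, 90, 20};
        Result r = scan(prices, 3, p -> p < 100);
        System.out.println("Count=" + r.count + " ScannedCount=" + r.scannedCount);
    }
}
```

With a limit of 3 over these prices, three items are evaluated but only one passes the `< 100` filter, so Count is 1 while ScannedCount is 3. Without a filter, the two values coincide.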
Consistency − Queries and scans perform eventually consistent reads by default, but you can request strongly consistent reads instead. Use the ConsistentRead parameter to change this setting.
Note − A strongly consistent read consumes double the capacity units of an eventually consistent read.
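The effect of the consistency setting on consumption can be worked out with a little arithmetic. The sketch below assumes the read unit size from the AWS documentation (one read capacity unit covers a strongly consistent read of up to 4KB, or two eventually consistent reads) and that a scan is charged on the total size of the data it reads, rounded up to the next 4KB. The names `ScanCapacitySketch` and `readCapacityUnits` are hypothetical.

```java
/** Estimates read capacity units consumed by one scan request. */
final class ScanCapacitySketch {
    static final int READ_UNIT_BYTES = 4 * 1024; // assumed 4KB read unit

    /**
     * Returns the capacity units for reading bytesScanned bytes.
     * Strongly consistent reads cost twice as much as eventually consistent ones.
     */
    static double readCapacityUnits(long bytesScanned, boolean consistentRead) {
        long blocks = (bytesScanned + READ_UNIT_BYTES - 1) / READ_UNIT_BYTES; // round up to 4KB
        return consistentRead ? blocks : blocks / 2.0;
    }

    public static void main(String[] args) {
        long onePage = 1024L * 1024L; // a full 1MB page
        System.out.println("eventually consistent: " + readCapacityUnits(onePage, false));
        System.out.println("strongly consistent:   " + readCapacityUnits(onePage, true));
    }
}
```

Under these assumptions, a full 1MB page costs 128 units with eventually consistent reads and 256 with strongly consistent reads, regardless of how many attributes are projected or how many items a filter discards.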
Performance − Queries offer better performance than scans because a scan crawls the full table or secondary index, resulting in a sluggish response and heavy throughput consumption. Scans work best for small tables and searches with fewer filters. You can still design lean scans by following a few best practices, such as avoiding sudden bursts of read activity and exploiting parallel scans.
A query finds a certain range of keys satisfying a given condition, with performance dictated by the amount of data it retrieves rather than the volume of keys. The parameters of the operation and the number of matches specifically impact performance.
Parallel Scan
Scan operations process data sequentially by default. They return data in 1MB portions, and the application must then request the next portion. This results in long scans for large tables and indexes.
This characteristic also means scans may not always fully exploit the available throughput. DynamoDB distributes table data across multiple partitions, and a sequential scan reads one partition at a time, so its throughput remains limited to that of a single partition.
The solution is to divide the table or index logically into segments and have “workers” scan the segments concurrently. Each worker's request uses the Segment parameter to identify the segment it scans and the TotalSegments parameter to give the total number of segments.
Worker Number
You must experiment with the number of workers (the TotalSegments parameter) to achieve the best application performance.
Note − Parallel scans with large sets of workers can consume all of the available throughput. Manage this issue with the Limit parameter, which you can use to stop a single worker from consuming all throughput.
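The worker pattern can be sketched without the AWS SDK. In this simulation the table is just the integers 0 to itemCount-1, and an item belongs to segment `id % totalSegments`; that assignment rule is an assumption made for illustration, since DynamoDB decides internally how items map to segments. `ParallelScanSketch`, `scanSegment`, and `parallelScan` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Simulates a parallel scan with one worker thread per segment. */
final class ParallelScanSketch {
    /** Stand-in for a Scan request carrying Segment and TotalSegments. */
    static List<Integer> scanSegment(int itemCount, int segment, int totalSegments) {
        List<Integer> out = new ArrayList<>();
        for (int id = segment; id < itemCount; id += totalSegments) {
            out.add(id); // this worker only sees items of its own segment
        }
        return out;
    }

    /** Runs one worker per segment concurrently and merges their results. */
    static List<Integer> parallelScan(int itemCount, int totalSegments) {
        ExecutorService pool = Executors.newFixedThreadPool(totalSegments);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int segment = 0; segment < totalSegments; segment++) {
                final int seg = segment;
                futures.add(pool.submit(() -> scanSegment(itemCount, seg, totalSegments)));
            }
            List<Integer> all = new ArrayList<>();
            for (Future<List<Integer>> future : futures) {
                all.addAll(future.get()); // wait for each worker to finish
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scan interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelScan(10, 4));
    }
}
```

Every worker uses the same TotalSegments value and a distinct Segment value from 0 to TotalSegments-1, so the segments together cover each item exactly once. Merged results arrive grouped by segment rather than in key order.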
The following is a deep scan example.
Note − The following program may assume a previously created data source. Before attempting to execute, acquire supporting libraries and create necessary data sources (tables with required characteristics, or other referenced sources).
This example also uses Eclipse IDE, an AWS credentials file, and the AWS Toolkit within an Eclipse AWS Java Project.
```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.ScanOutcome;
import com.amazonaws.services.dynamodbv2.document.Table;

public class ScanOpSample {
    static DynamoDB dynamoDB = new DynamoDB(
        new AmazonDynamoDBClient(new ProfileCredentialsProvider()));
    static String tableName = "ProductList";

    public static void main(String[] args) throws Exception {
        findProductsUnderOneHun(); // finds products under 100 dollars
    }

    private static void findProductsUnderOneHun() {
        Table table = dynamoDB.getTable(tableName);

        Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
        expressionAttributeValues.put(":pr", 100);

        ItemCollection<ScanOutcome> items = table.scan(
            "Price < :pr",                              // FilterExpression
            "ID, Nomenclature, ProductCategory, Price", // ProjectionExpression
            null,                                       // No ExpressionAttributeNames
            expressionAttributeValues);

        System.out.println("Scanned " + tableName + " to find items under $100.");
        Iterator<Item> iterator = items.iterator();
        while (iterator.hasNext()) {
            System.out.println(iterator.next().toJSONPretty());
        }
    }
}
```