BigQuery Admin reference guide: Query optimization

Last week in the BigQuery reference guide, we walked through query execution and how to leverage the query plan. This week, we’re going a bit deeper – covering more advanced queries and tactical optimization techniques.  Here, we’ll walk through some query concepts and describe techniques for optimizing related SQL. 

Filtering data

From last week’s post, you already know that the execution details for a query show us how much time is spent reading data (either from persistent storage, federated tables or from the memory shuffle) and writing data (either to the memory shuffle or to persistent storage). Limiting the amount of data that is used in the query, or returned to the next stage, can be instrumental in making the query faster and more efficient. 

Optimization techniques

1. Necessary columns only: Only select the columns necessary, especially in inner queries. SELECT * is cost inefficient and may also hurt performance. If the number of columns to return is large, consider using SELECT * EXCEPT to exclude unneeded columns.
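For example (the wide columns named here are hypothetical), SELECT * EXCEPT returns everything except the columns you list:

SELECT * EXCEPT (raw_payload, debug_info)   -- skip wide, unneeded columns
FROM `dataset.table`;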

2. Auto-pruning with partitions and clusters: Like we mentioned in our post on BigQuery storage, partitions and clusters are used to segment and order the data. Using a filter on columns that the data is partitioned or clustered on can drastically reduce the amount of data scanned. 
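For instance, assuming a hypothetical sales table partitioned on event_date and clustered on customer_id, filtering on both lets BigQuery prune partitions and clustered blocks instead of scanning the full table:

SELECT customer_id, SUM(amount) AS total_amount
FROM `dataset.sales`                                     -- hypothetical partitioned, clustered table
WHERE event_date BETWEEN '2021-07-01' AND '2021-07-31'   -- prunes partitions
  AND customer_id = 'C123'                               -- benefits from clustering
GROUP BY customer_id;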

3. Expression order matters: BigQuery assumes that the user has provided the best order of expressions in the WHERE clause, and does not attempt to reorder expressions. Expressions in your WHERE clauses should be ordered with the most selective expression first.  The optimized example below is faster because it doesn’t execute the expensive LIKE expression on the entire column content, but rather only on the content from user, ‘anon’.
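A minimal sketch of that ordering, assuming a hypothetical comments table with user_name and content columns: the cheap, selective equality check comes first, so the expensive LIKE only runs on the rows that survive it.

SELECT *
FROM `dataset.comments`           -- hypothetical table
WHERE user_name = 'anon'          -- most selective, cheapest expression first
  AND content LIKE '%error%';     -- expensive pattern match evaluated on fewer rows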

4. Order by with limit: Writing results for a query with an ORDER BY clause can result in Resources Exceeded errors. Since the final sorting must be done on a single worker, ordering a large result set can overwhelm the slot that is processing the data. If you are sorting a large number of values, use a LIMIT clause, which limits the amount of data passed to the final slot.
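For example (table and column names are hypothetical), adding a LIMIT bounds the data the final sorting slot has to handle:

SELECT title, views
FROM `dataset.articles`   -- hypothetical table
ORDER BY views DESC
LIMIT 1000;               -- keeps the final single-worker sort manageable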

Understanding aggregation

In an aggregation query, partial GROUP BY aggregation is done on individual workers, and the data is then shuffled so that key-value pairs with the same key land on the same worker. Further aggregation occurs there, and the result is passed to a single worker and served.

Repartitioning

If too much data ends up on a single worker, BigQuery may re-partition the data. Let’s consider the example below. The sources start writing to Sinks 1 and 2 (partitions within the memory shuffle tier). Next, the shuffle monitor detects that Sink 2 is over its limit. The partitioning scheme then changes: the sources stop writing to Sink 2 and instead start writing to Sinks 3 and 4.

Optimizations

1. Late aggregation: Aggregate as late and as seldom as possible, because aggregation is very costly. The exception is if a table can be reduced drastically by aggregation in preparation for a join – more on this below.

For example, instead of a query like this, where you aggregate in both the subqueries and the final SELECT:

SELECT
  t1.dim1, SUM(t1.m1), SUM(t2.m2)
FROM (SELECT dim1, SUM(metric1) m1 FROM `dataset.table1` GROUP BY 1) t1
JOIN (SELECT dim1, SUM(metric2) m2 FROM `dataset.table2` GROUP BY 1) t2
ON t1.dim1 = t2.dim1
GROUP BY 1;

You should only aggregate once, in the outer query:

SELECT
  t1.dim1, SUM(t1.metric1), SUM(t2.metric2)
FROM (SELECT dim1, metric1 FROM `dataset.table1`) t1
JOIN (SELECT dim1, metric2 FROM `dataset.table2`) t2
ON t1.dim1 = t2.dim1
GROUP BY 1;

2. Nest repeated data: Let’s imagine you have a table showing retail transactions. If you model one order per row and nest line items in an ARRAY field, then there are cases where a GROUP BY is no longer required. For example, you can get the total number of items in an order by using ARRAY_LENGTH, as in the sketch below.

{order_id1, [ {item_id1}, {item_id2} ] }
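As a sketch of that model (assuming a hypothetical orders table with an ARRAY of line-item STRUCTs), the item count per order needs no GROUP BY at all:

SELECT
  order_id,
  ARRAY_LENGTH(line_items) AS item_count   -- counts nested items without grouping
FROM `dataset.orders`;                     -- hypothetical table with an ARRAY<STRUCT<...>> column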

Understanding joins

One powerful aspect of BigQuery is the ability to combine data, and understand relationships and correlation information from disparate sources. Much of the JOIN syntax is about expressing how that data should be combined, and how to handle data when information is mismatched. However, once that relationship is encoded, BigQuery still needs to execute it. 

Hash-based joins

Let’s jump straight into large scale join execution.  When joining two tables on a common key, BigQuery favors a technique called the hash-based join, or more simply a hash join. With this technique, we can process a table using multiple workers, rather than moving data through a coordinating node.  

So what does hashing actually involve? When we hash values, we’re converting the input value into a number that falls in a known range. There are many properties of hash functions we care about for hash joins, but the important ones are that our function is deterministic (the same input always yields the same output value) and has uniformity (our output values are evenly spread throughout the allowed range of values).

With an appropriate hashing function, we can then use the output to bucket values.  For example, if our hash function yields an output floating point value between 0 and 1, we can bucket by dividing that key range into N parts, where N is the number of buckets we want.  Grouping data based on this hash value means our buckets should have roughly the same number of discrete values, but even more importantly, all duplicate values should end up in the same bucket. 
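To make the bucketing idea concrete, here is a small illustration in BigQuery SQL (this is not how the engine implements it internally, just a sketch): hashing a key with FARM_FINGERPRINT and taking a modulus assigns every duplicate key value to the same one of N buckets.

SELECT
  join_key,
  MOD(ABS(FARM_FINGERPRINT(CAST(join_key AS STRING))), 4) AS bucket   -- one of 4 buckets
FROM `dataset.table`;   -- hypothetical table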

Now that you understand what hashing does, let’s talk through joining.

To perform the hash join, we’re going to split up our work into three stages.

Stage 1: Prepare the first table

In BigQuery, data for a table is typically split into multiple columnar files, but within those files there’s no sorting guarantee that ensures that the columns that represent the join key are sorted and colocated.  So, what we do is apply our hashing function to the join key, and based on the buckets we desire we can write rows into different shuffle partitions. 

In the diagram above, we have three columnar files in the first table, and we use our hashing technique to split the data into four buckets (color coded).  Once the first stage is complete, the rows of the first table are effectively split into four “file-like” partitions in shuffle, with duplicates co-located.

Stage 2:  Prepare the second table

This is effectively the same work as the first stage, but we’re processing the other table we’ll be joining data against. The important thing to note here is that we need to use the same hashing function and therefore the same bucket grouping, as we’re aligning data. In the diagram above, the second table had four input files (and thus four units of work), and the data was written into a second set of shuffle partitions.

Stage 3: Consume the aligned data and perform the join

After the first two stages are completed, we’ve aligned the data in the two tables using a common hash function and bucketing strategy.  What this means is that we have a set of paired shuffle partitions that correspond to the same hash range, so rather than scanning potentially large sets of data, we can execute the join in pieces, as each worker is provided only the relevant data for doing its subset of the join.

It’s at this point that we care about the nature of the join operation again; depending on the desired join relationship we may yield no rows, a single row, or many rows for any particular input row from the original input tables.

Now, you can also get a better sense of how important having a good hashing function may be:  if the output values are poorly distributed, we have problems because we’re much more likely to have a single worker that’s slower and forced to do the majority of the work.  Similarly, if we picked our number of buckets poorly, we may have issues due to having split the work too finely or too coarsely.  Fortunately, these are not insurmountable problems, as we can leverage dynamic planning to fix this: we simply insert query stages to adjust the shuffle partitions.

Broadcast joins

Hash-based joins are an incredibly powerful technique for joining lots of data, but your data isn’t always large enough to warrant it.  For cases where one of the tables is small, we can avoid all the alignment work altogether.

Broadcast joins work in cases where one table is small.  In these instances, it’s easiest to replicate the small table into shuffle for faster access, and then simply provide a reference to that data for each worker that’s responsible for processing the other table’s input files.

Optimization techniques

1. Largest table first: BigQuery best practice is to manually place the largest table first, followed by the smallest, and then by decreasing size. Only under specific table conditions does BigQuery automatically reorder/optimize based on table size.

2. Filter before joins: WHERE clauses should be executed as soon as possible, especially within joins, so the tables to be joined are as small as possible. We recommend reviewing the query execution details to see if filtering is happening as early as possible, and either fixing the condition or using a subquery to filter in advance of a JOIN (see the sketch after this list).

3. Pre-aggregate to limit table size: As mentioned above, aggregating tables before they are joined can help improve performance – but only if the amount of data being joined is drastically reduced and the tables are aggregated to the same level (i.e., if there is only one row for every join key value).

4. Clustering on join keys: When you cluster a table based on the key that is used to join, the data is already co-located, which makes it easier for workers to split the data into the necessary partitions within the memory shuffle.
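As a minimal sketch of the "filter before joins" advice (table and column names are hypothetical), the WHERE clause is pushed into a subquery so the join only ever sees the reduced row set:

SELECT o.order_id, c.customer_name
FROM (
  SELECT order_id, customer_id
  FROM `dataset.orders`              -- hypothetical large table
  WHERE order_date >= '2021-07-01'   -- filter applied before the join
) AS o
JOIN `dataset.customers` AS c        -- hypothetical table
  ON o.customer_id = c.customer_id;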

A detailed query: finding popular libraries in GitHub

Now that we understand some optimization techniques for filtering, aggregating and joining data – let’s look at a complex query with multiple SQL techniques. Walking through the execution details for this query should help you understand how data flows and mutates as it moves through the query plan – so that you can apply this knowledge and understand what’s happening behind the scenes in your own complex query.  

The public GitHub data has one table that contains information about source code filenames, while another contains the contents of those files. By combining the two, we can filter down to focus on interesting files and analyze them to understand which libraries are frequently used by developers. Here’s an example of a query that does this for developers using the Go programming language. It scans files having the appropriate (.go) extension, looks for statements in the source code that import libraries, then counts how often those libraries are used and how many distinct code repositories use them.

In SQL, it looks like this:

SELECT
  entry,
  COUNT(*) as frequency,
  COUNT(DISTINCT repo_name) as distinct_repos
FROM (
  SELECT
    files.repo_name,
    SPLIT(REGEXP_EXTRACT(contents.content,
      r'.*import\s*[(]([^)]*)[)]'), '\n') AS entries
  FROM `bigquery-public-data.github_repos.contents` AS contents
  JOIN (
    SELECT
      id, repo_name
    FROM `bigquery-public-data.github_repos.files`
    WHERE path LIKE '%.go' GROUP BY id, repo_name
  ) AS files
  USING (id)
  WHERE REGEXP_CONTAINS(contents.content, r'.*import\s*[(][^)]*[)]')
)
CROSS JOIN UNNEST(entries) as entry
WHERE entry IS NOT NULL AND entry != ""
GROUP BY entry
ORDER BY distinct_repos DESC, frequency DESC
LIMIT 1000

We can see from a casual read that we’ve got lots of interesting bits here: subqueries, a distributed join (the contents and files tables), array manipulation (CROSS JOIN UNNEST), and powerful features such as regular expression filters and computing distinctness.

Detailed stages and steps

First, let’s examine the full details of the plan in a graph format.  Here, we’re looking at the low-level details of how this query is run, as a set of stages.  Let’s work through the query stages in detail. If you want a graphical representation similar to the one we’re showing here, check out this code sample!

Stages S00-S01: Reading and filtering from the “files” table

The initial stage (corresponding to the inner subquery of the SQL) begins by processing the “files” table.  We can see the first task is to read the input, and immediately filter that to only pass through files with the appropriate suffix.  We then group based on the id and repo name, as we’re potentially working with many duplicates, and we only want to process each distinct pair once. In stage S01, we continue the GROUP BY operation; each worker in the first stage only deduplicated the repo/id pairs in its individual input file(s), so the aggregate stage here combines those results to deduplicate across all input rows in the “files” table.

Stage S02: Reading in the “contents” table

In this stage, we begin reading the source code in the “contents” table, looking for “import” statements (the syntax for referencing libraries in the Go language).  We collect information about the id (which will become the join key), and the content which has matches. You can also see that in both this stage and the previous (S01), the output is split based on a BY HASH operation.  This is the first part of starting the hash join, where we begin to align join keys into distinct shuffle partitions.  However, anytime we’re dealing with data where we want to divide the work we’ll be splitting it into shuffle buckets with this operation.

Stages S03 – S0A: Repartitioning

This query invoked several repartitioning stages.  This is an example of the dynamic planner rebalancing data as it’s working through the execution graph.  Much of the internals of picking appropriate bucketing is based on heuristics, as operations such as filtration can drastically change the amount of data flowing in and out of query stages. In this particular query, the query plan has chosen a non-optimal bucketing strategy, and is rebalancing the work as it goes.  Also note that this partitioning is happening on both sides of what will become the joined data, because we need to keep the partitioned data aligned as we enter the join.

Stage S0B: Executing the join

Here’s where we begin correlating the data between the two inputs.  You can see in this stage we have two input reads (one for each side of the join), and start computing counts.  There’s also some overloaded work here; we consume the file contents to yield an array representing each individual library being imported, and make that available to future stages.

Stages S0C – S0D: Partial Aggregations

These two stages are responsible for computing our top level statistics:  we wanted to count the total number of times each library was referenced, as well as the number of distinct repositories.  We end up splitting that into two stages.

Stages S0E-S0F: Ordering and limiting

Our query requested only the top 1000 libraries ordered first by distinct repository count, and then total frequency of use.  The last two stages are responsible for doing this sorting and reduction to yield the final result.

Other optimization techniques

As a final thought, we’ll leave you with a few more optimization techniques that could help improve the performance of your queries.

Multiple WITH clauses: The WITH statement in BigQuery is like a macro. At runtime, the contents of the subquery will be inlined every place the alias is referenced. This can lead to query plan explosion, visible when the plan executes the same query stages multiple times. Instead, try using a TEMP table (a sketch follows the string-comparison example below).

String comparisons: REGEXP_CONTAINS can offer more functionality, but it has a slower execution time compared to LIKE. Use LIKE when the full power of regex is not needed (e.g. wildcard matching):

regexp_contains(dim1, '.*test.*') becomes dim1 like '%test%'
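And here is a minimal sketch of the TEMP table suggestion above (hypothetical table and column names): the intermediate result is materialized once, so later references read it instead of re-running the subquery the way an inlined WITH alias would.

CREATE TEMP TABLE user_stats AS
SELECT user_id, COUNT(*) AS events
FROM `dataset.events`   -- hypothetical table
GROUP BY user_id;

-- Both queries below reuse the materialized result.
SELECT COUNT(*) FROM user_stats WHERE events > 100;
SELECT AVG(events) FROM user_stats;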

First or last record: When trying to calculate the first or last record in a subset of your data, the ROW_NUMBER() function can fail with Resources Exceeded errors if there are too many elements to ORDER BY in a single partition. Instead, try using ARRAY_AGG(), which runs more efficiently because the ORDER BY is allowed to drop everything except the top record on each GROUP BY. For example, this:

select
  * except(rn)
from (
  select *,
    row_number() over(
      partition by id
      order by created_at desc) rn
  from
    `dataset.table` t
)
where rn = 1

Becomes this:

select
  event.*
from (
  select array_agg(
    t order by t.created_at desc limit 1
  )[offset(0)] event
  from
    `dataset.table` t
  group by
    id
)

See you next week!

Thanks again for tuning in this week! Next up is data governance, so be sure to keep an eye out for more in this series by following Leigha on LinkedIn and Twitter.

Related Article

BigQuery Admin reference guide: Query processing

BigQuery is capable of some truly impressive feats, be it scanning billions of rows based on a regular expression, joining large tables, …


Source: Data Analytics

Introducing Active Assist recommendations for BigQuery capacity planning

BigQuery already offers highly flexible pricing models, such as on-demand and flat-rate pricing for running queries, to meet the diverse needs of our users.

Today, we’re excited to make it even easier for you to optimize BigQuery usage with new BigQuery slot recommendations powered by Active Assist, a part of Google Cloud’s AIOps solution that uses data, intelligence, and machine learning to reduce cloud complexity and administrative toil. 

This feature is available in preview to all BigQuery customers who run queries using on-demand pricing but are exploring flat-rate pricing. It helps answer questions like “How many slots do I need?” based on your BigQuery usage history, spend, and other signals. With actionable and automatic insights and recommendations, you can easily understand the cost benefits and performance tradeoffs of switching to the longer-term reservations for a given project or organization. 

You can access these recommendations in:

Slot Estimator in the Capacity Management section within BigQuery Administration,

Recommendation Hub,

Recommender API and BigQuery export of recommendations to make it easy for you to integrate with your company’s existing workflow management tools.

How many BigQuery slots do I need?

As a reminder, BigQuery uses slots (a virtual unit of compute and memory) to run your queries. With on-demand pricing, you don’t need to think about pre-provisioning slots. However, depending on the characteristics of a given workload, monthly flat-rate pricing could be a more cost-effective option compared to on-demand pricing.

If you are exploring flat-rate pricing, choosing the optimal number of slots to purchase becomes an important question. You don’t want to buy too many and end up with idle/unused capacity, and you don’t want to buy too few and end up not meeting your query performance requirements. The answer to the “How many slots should I buy?” question depends on your requirements for performance, throughput, and utility. Some key things to consider:

Type of the workload and its tolerance to a potential query performance impact

Shape of the workload (e.g. predictable/unpredictable pattern, spikes) 

The historical slot usage of your projects

Your desired monthly budget

Until now, you would have to gather empirical data with slot purchases at different levels and analyze the current slot usage of your projects with a given workload using Cloud Logging or INFORMATION_SCHEMA to estimate how many slots to purchase. Today, we are making this a whole lot easier with the BigQuery slot recommendations.
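For reference, a rough sketch of that manual analysis using the INFORMATION_SCHEMA jobs view (the region qualifier and the 30-day window are assumptions; adjust them for your project):

-- Approximate average slot usage per day over the last 30 days.
SELECT
  DATE(creation_time) AS usage_date,
  SUM(total_slot_ms) / (1000 * 60 * 60 * 24) AS avg_slots
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
GROUP BY usage_date
ORDER BY usage_date;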

Discovering and acting on BigQuery slot recommendations

BigQuery Slot Recommender analyzes your BigQuery usage activity across all projects under your organization during the last 30 days. The recommender analyzes your BigQuery on-demand slot usage and presents cost/performance tradeoffs in percentiles. For example, if a project used 1500 on-demand slots at the 95th percentile, it means that it used less than 1500 slots over 95% of the time, and if you switch to the corresponding monthly commitment of 1500 slots, then you might see reduced query performance 5% of the time, given no substantial changes to the characteristics of your workload.

For projects and organizations that can benefit from a longer-term monthly commitment, the recommender can generate recommendations to consider switching from on-demand to monthly slots. Recommendations include a range of options along the spectrum of optimizing for cost or performance, with associated cost savings estimates. Here’s what an example recommendation looks like in the Cloud Console UI:

Example slot recommendations under BigQuery Capacity Management

This particular example recommendation provides you with the following alternative options:

Cost optimized (90th percentile): If you switch to a monthly commitment of 5000 slots, 10% of queries may see reduced performance, with estimated monthly savings of $60,000.

Balanced (95th percentile): If you switch to a monthly commitment of 6200 slots, 5% of queries may see reduced performance, with estimated monthly savings of $36,000.

Performance optimized (99th percentile): If you switch to a monthly commitment of 8500 slots, 1% of queries may see reduced performance, with estimated monthly savings of $10,000.

Cover all usage (100th percentile): This option will not save you any money and might even cost more. Note that in such cases BigQuery Slot Recommender shows “No cost savings” during the preview; we are working on providing “negative cost” estimates as part of our next iteration on this product.

In addition to these recommendations, you can also examine your usage data presented as a time series. For example, you might have a spike at a particular time every day that accounts for all your slot usage above the 95th percentile, which could potentially be addressed by purchasing flex slots or scheduling queries at different times to mitigate any performance impact.

Getting started with the BigQuery Slot Recommender

To get started with the slot recommendations, check Slot Estimator, BigQuery’s interactive capacity management tool, under BigQuery Capacity Management in the Google Cloud Console (check out this blog to learn more about Slot Estimator). To view the recommendations, you will need an appropriate IAM role, such as BigQuery Resource Admin, or an individual set of permissions for BigQuery Slot Recommender, BigQuery Slot Estimator, and to view resources in a given organization.

You can also automatically export the recommendations to BigQuery and then investigate slot recommendations with Data Studio or Looker. Or, you can use Connected Sheets to interact with the data from Google Sheets without having to write SQL queries.

As with any other Recommender, you can choose to opt out of data processing for your organization or your projects at any time by disabling the appropriate data groups in the Transparency & Control tab under Privacy & Security settings.

We hope you use BigQuery Slot Recommender to optimize your usage of BigQuery, and can’t wait to hear your feedback and thoughts about this feature! Please feel free to reach us at active-assist-feedback@google.com. We also invite you to sign up for our Active Assist Trusted Tester Group if you would like to get early access to new features as they are developed.

Related Article

Introducing Active Assist: Reduce complexity, maximize your cloud ROI

Introducing Active Assist, a family of tools to help you easily optimize your Google Cloud environment.


Source: Data Analytics

Pros and Cons of Having a Data-Driven eCommerce Business on Amazon

There are many ways that you can use big data to create a profitable business. One of the smartest ways for entrepreneurs to utilize data is by creating an ecommerce business.

You can run a profitable ecommerce business through Amazon. SellerApp author Dilip Vamanan wrote a great article on the merits of using data analytics as an Amazon seller. However, you might want to also consider other ecommerce platforms for your data-driven ecommerce business.

Using Data Analytics to Create a Successful Business on Amazon

So today, you have decided that it is time to change something in your life. Go to the gym, learn to play the guitar, or even learn a new business that will be both interesting and profitable. Take selling products on Amazon, for example.

Making purchases via Amazon.com is something most of us are familiar with. Still, we are a lot less knowledgeable when it comes to selling on Amazon. So if you are considering whether running a business on Amazon.com can be a highly profitable idea, you may want to keep reading. You will learn how to use data analytics to make the most of your efforts.

The first thing you might decide is to find out if there are any pitfalls with creating a data-driven Amazon business and if the advantages offered outweigh them. That’s what we will cover in this post. So, keep on reading.

Pros of Creating a Data-Driven Business as an Amazon Seller

There are some awesome benefits of creating a business on Amazon. You can leverage these benefits even more by utilizing big data. Here are some ideas to take into consideration.

1. Simplifying the logistics

Amazon has a great interface that automates a lot of processes for sellers. Sellers that understand data analytics can get even more out of this interface.

For instance, you can enroll in the Fulfillment by Amazon (FBA) program and free yourself from multiple tasks at once. Thus, the platform will store your products in its warehouses and take care of logistics and shipping. Moreover, this service deals with returns. You have to correctly fill out the product listing on the website, ensure that the items are always in stock, and engage in advertising.

If you take advantage of the data analytics capabilities that Amazon provides, you can streamline even more of the processes. For example, you can use any Amazon research tool that will simplify product listing monitoring, competitor analysis, and protection against hijackers. It is constantly accumulating more data on customers and sellers using their platform, so you can make more informed decisions.

2. Use analytics to reach customers with a high level of intent

As of 2022, Amazon.com is the most popular e-commerce platform in the US, with two billion visits every month. The second place went to eBay with approximately 689 million impressions, followed by Walmart with 389 million. Furthermore, these people do not just look through the items; they are ready to buy. For instance, the number of customers who chose the Prime plan from the fourth quarter of 2019 to the first quarter of 2021 increased from 150 million to 200 million. On average, they spend about $1,400 on the website, and 48% of these users buy something weekly.

You can use data analytics to reach customers who are more eager to make a purchase. SEO tools like Ahrefs collect data on Amazon search activity, which you can use to estimate the percentage of buyers ready to make a purchase rather than reaching random users. This is a benefit that other ecommerce platforms don’t offer.

3. Amazon uses big data to boost UX

Big data is a valuable part of user experience optimization. Amazon takes advantage of this, which is one of the reasons they have such a great site.

A survey of over 2,000 US customers found that 89% of shoppers are more likely to buy items on Amazon than other e-commerce platforms. It is one of the reasons why the conversion on this marketplace is higher than on others. It is worth noting that many people choose Amazon as their main place to find products. 63% of buyers start their search here, and 82% visit the platform regularly to compare prices.

4. AI helps you run the business remotely

You don’t have to rent an office or hire a huge team to trade on Amazon, especially if you’re just getting started. All you need is access to the Internet, which means you can manage all processes from anywhere in the world. It also gives you freedom in your schedule. According to recent research, 54% of Amazon sellers successfully combine trading with another job.

You can manage things even easier if you take advantage of AI. AI allows you to automate many processes as an Amazon seller. Data Driven Investor has a list of some great AI tools that Amazon sellers can use to automate processes, such as Feedback Five.

5. AI provides constant improvements on the platform

We can safely say that Amazon is continuously evolving and setting trends in e-commerce worldwide. The ecommerce giant is regularly using AI to update its platform and remove bugs. If you are aware of what is happening on Amazon, it is much easier for you to become a leader on other local marketplaces.

Drawbacks of Selling via Amazon as a Data-Driven Business

As much as we’d like to continue the list of pros, there are a few challenges you may encounter on Amazon, even if you already have e-commerce experience. Big data can help resolve some of these issues, but it won’t entirely eliminate them.

1. Initial investments

According to experienced sellers, you need to invest about $20,000 to fully launch a new product under your brand. At the same time, there are cases when $5-10,000 was enough. You need to remember that most of these funds are spent not on the purchase of goods but on their promotion.

You are going to have to spend even more if you want to create a data-driven business on Amazon. There are a lot of data analytics and AI tools that Amazon sellers can leverage, such as Jungle Scout, Helium 10, Ahrefs and AMZScout. However, these tools are not free and will add to your startup costs.

2. High competition

As the platform’s popularity among buyers is high, the number of sellers is constantly growing. Now their number has reached 9.7 million, of which 2 million are permanent and active. Therefore, to get on the first page and make yourself known, you will need to put in a lot of work and invest money in advertising and other promotion methods. You will have an edge if you understand big data, but you are going to still be competing against a lot of other sellers. Data-driven sellers might have an easier time standing out on other ecommerce platforms.

3. Complicated registration

Few people warn that troubles with Amazon can begin long before you start selling anything. Be sure to study all the registration guidelines to sign up for your account and list the goods properly.  

4. Permanent account suspensions

Unfortunately, the word “permanent” is not an exaggeration here. Amazon can suspend your account and require documents at almost any stage (from verifying your identity to providing invoices for goods). Sometimes the platform goes too far, and even if you have all the required paperwork, it can take a long time to get your access back.

Also, there are mass bans several times a year, which are often unjustified. Thus, you need to be prepared that you will need to contact support every time, and no one will make up for the loss of money. Most often, such blockings occur due to new security measures. Unfortunately, some sellers do indeed forge documents, sell low-quality goods, violate intellectual property rights, etc.

Should Data-Driven Sellers Use Amazon or Another Marketplace: Final Thoughts

Should ecommerce sellers with a background in big data use Amazon or another platform? It’s up to you. Every Amazon seller might face the above-mentioned pros and cons, and they will have an easier time if they use data analytics properly. However, each experience differs depending on the selling strategy, product niche, and method of selling (FBA, FBM, etc.). Our opinion: if you invest enough work, approach promotion wisely, and are not afraid to solve problems (which can arise in absolutely any business), then your chances of success are very high. In any case, trading on Amazon is an exciting and profitable process that expands your horizons.

The post Pros and Cons of Having a Data-Driven eCommerce Business on Amazon appeared first on SmartData Collective.

Source: SmartData Collective

Social Analytics Tools Are Crucial for Successful Instagram Marketing

Analytics-driven businesses need to be prudent about investing in the right technology. This is especially true when they are trying to come up with a sensible marketing strategy.

Social analytics is a very important part of marketing. You can use social analytics tools to create an effective marketing campaign on Instagram, Facebook or other social media platforms. It is no wonder that companies around the world are spending $9.5 billion on social analytics this year.

We could write entire articles on using social analytics for each social network, so this one will focus on Instagram. Keep reading to learn how to make the most of social analytics for Instagram.

Social Analytics is the Key to a Successful Instagram Strategy

Instagram is a platform where how you represent what you do has great importance. There has to be a balance between creativity and technicality to excel here. You miss out on one, and it is not going to work out. Now, the creativity is in your hand. Nobody can teach you about that.

But for the other aspect, you have to not only look after your representation but also keep a check on whether your posts are reaching people.

Social analytics tools can help you stand out. You can use social analytics tools to optimize your Instagram profile and improve your marketing strategy on this social network.

The whereabouts of your profile have to always be at your fingertips, and you have to keep experimenting and seeing what works best for you.

That is when the talk of Instagram Analytics comes to the scene. We talked about using AI in Instagram marketing but analytics is just as important.

What is Instagram Analytics?

Instagram Analytics is the way you understand the pattern of your posts and account and keep a check on your performance.

This is how you will know if things are going smoothly in your profile. You can also review factors that have helped you gather engagement or disturbed the same.

The patterns will give a clear depiction or detailed overview of every factor from top to toe.

On the platform, we have Instagram insights that show you the analytics of everything. The profile visits of your account or anything that gets you traffic will be available through this.

This feature is available only for business accounts or accounts with tremendous reach.

How to Access Insights on Instagram?


The following steps will take you to the insights of your Instagram.

Tap on the hamburger icon in the upper right corner of your profile.

Go to Insights

You will have three tabs: Activity, Content, Audience

Tap on any of those

In that regard, you will also have paid tools that can serve your purpose and help you out.

Other Instagram Analytics Tools

Some paid tools give you deeper insights. If your business demands the same, you can look for that as well.

Some of those are:

Hootsuite

Hootsuite is a flexible tool. Subscribing gives you a detailed view of everything you could need from an analysis.

It also comes with a training video; once you get familiar with that, you are good to go.

Socialbakers


Socialbakers is designed with businesses in mind. Along with rigorous analysis, it also provides suggestions for improvement.

In addition to that, it also shows you the whereabouts of your competition and gives you an idea of where you stand.

Social Sprout


Social Sprout also gives an in-depth analysis of your profile. 

In addition, it gives you access to scheduling your posts, videos, and photos, and provides options for customization.

Importance of Instagram Analytics

Since we are discussing Instagram analytics, it is worth understanding why it receives the attention it does.

Let us talk about that now.

Perform Your Best

While you are on Instagram the tendency is always to perform better than your competitor.  So in that process, you stress over your content and try to bring out the best you can.

No doubt this is one of the best ways you can make a mark. But, unfortunately, this is not enough. Not enough in the sense that if you cannot make it reach your target audience, you will still be considered underperforming.

Your content can help you out after it has reached people. Now, for the reach, you will have to make sure of a lot of things.

From hashtags, captions to scheduling your posts, you have to take care of everything.

Now, what is the outcome of all this? To know whether the tactics you are adopting are working, or whether you have to look for other options, you have to tune in to analytics.

If you do that religiously, then you are sure to overcome most of your shortcomings, if not all, and that will definitely take you to the top of your industry.

If for some reason your analytics are still not up to the mark, buying real likes on MegaFamous is a great option.

Understand Your Audience


If you subscribe to a more defined analytics tool, you have access to your audiences’ demographics as well.

You will not only get to know who is actually engaging with your content, but also have a detailed overview of the type of people who are interested in consuming your content.

For example, if you are in the business of handmade jewelry, your followers list might have more women in their 20s and 30s.

If your analytics also show the same result, you can take the same into consideration and work on that.

This may sound like an insignificant detail.

But once you get to know your audience, your mind will automatically start to brainstorm about it, and you will surely find something that will benefit your business.

Prevent Yourself From Tiresome Work

If something does not work out, you try to look for ways to make it better. Now, on Instagram, if you do that by trial and error method, it is going to be ridiculously tiresome.

You will also have to spend a great amount of money and time if the situation is not resolved.

On the other hand, analytics is available to you at your fingertips.

That is why it is advisable to keep an eye on it regularly, so that you can cut short issues that could grow over time.

Social Analytics is Crucial for Succeeding on Instagram

As the topic here is Instagram Analytics, we have covered the same and its importance in detail.

By now you will already have an idea on how this is not a choice but a need if you are into Instagram for business.

If we may say so, the importance of analytics is still underappreciated, and it will only grow in the coming days.

Instagram itself, as well as the paid tools, will bring more and more options to make your work even easier.

The post Social Analytics Tools Are Crucial for Successful Instagram Marketing appeared first on SmartData Collective.

Source: SmartData Collective

Accountants Are Using Machine Learning to Boost Efficiency

Machine learning technology is changing many sectors in tremendous ways. The accounting sector is no exception. Analysts from Markets and Markets project that the market for AI in the accounting industry will exceed $4.7 billion within the next two years.

A lot of accountants are discovering innovative ways to take advantage of the benefits of machine learning. They have found that AI technology can help boost efficiency, reduce errors and improve customer satisfaction.

Machine Learning is a Huge Boon to the Accounting Sector

Accountants are an innovative and successful bunch since there’s a lot more to the profession than just number crunching. However, working as an accountant in a company and running your own accounting firm are two very different roles. Experienced accountants do indeed have a better understanding of core business than most. Still, there are several other aspects of business management that they might not be trained or prepared to handle.

Consequently, this knowledge gap can affect the company’s efficiency unless the necessary steps are taken to prepare a counter-strategy. The good news is that you can reduce the issues that you will experience by taking advantage of machine learning technology. Smart accountants also recognize the need to leverage data science in their profession.

In the coming paragraphs, we will discuss a few tips for boosting efficiency in accounting firms with AI, as suggested by some of the most successful names in the sector.

Streamlining Workflow with Machine Learning

We are not saying that accountants are unaware of the importance of workflow management in business. Still, even the most qualified accountants are not always trained to be business leaders capable of implementing the necessary steps. The good news is that there are ways to improve workflows with AI.

As for what steps can be taken to maximize productivity and improve workflow management at an accounting with AI, consider the following tried and tested suggestions:

Identify all business processes (the work) and rank them in accordance with their necessity and value to the firm. Machine learning technology can improve workflows and help you assign a weight to the importance of different tasks. You might subjectively rank things in a certain order, but machine learning algorithms can be trained to tell how much value should be attributed to a certain function.

Define, designate, and delegate job roles for your workforce to handle the identified and ranked processes accordingly. Machine learning technology can help in this area by determining the skills different employees have in handling different tasks. It can accomplish this in a number of ways, such as reviewing past performance reports and error rates on certain projects to figure out which employees you should delegate to.

Provide employees with access to the appropriate AI-driven tools, so that they can become more productive with their time. Many new software applications use AI, such as SaaS and specialized financial modeling tools. However, AI tools are only useful if you make sure employees have access to them.

Invest in digital work map/project management software, so that both progress and bottlenecks always remain transparent. Digital work maps use AI technology to find the most efficient use of resources.

Use Machine Learning Strategically and Create Provisions to Manage and Mitigate the Impacts of Liability Lawsuits

The best way to minimize your firm’s chances of getting sued is never to make a single mistake. Although that is what every accounting firm should aspire to, it would be completely irrational to expect that for obvious reasons. A single mistake, a minor oversight, or sometimes, just plain bad luck can cause severe damage to a client’s finances.

Consequently, the client will hold you accountable for the mistake, which in turn can have severe negative impacts on your accounting company’s reputation, finances, and focus. The Hartford explains how accountants professional liability insurance can help accounting firms manage and nullify most of these negative impacts, even before they become serious problems. When an accounting firm has professional liability insurance, it means that the insurer will either:

Compensate the affected client, ensuring that they will no longer be able to sue the client accounting firm upon accepting the agreed compensation deal

Or,

Pay for legal expenses, should it become necessary for a client firm to defend themselves against an unavoidable lawsuit

Note that the best accountants professional liability insurance policies will cover almost all accounting mistakes related to misinterpretation, inaccuracy, and even negligence on the client firm’s part.

In addition to creating a provision, you should try to use AI technology to automate certain tasks that are prone to human error. This will help reduce the risk of costly mistakes. You can also use AI tools to review work for errors.

Choose Gradual Digitization

Digitize every aspect of your accounting firm that can be digitized but do so gradually. Not every accountant in your firm will be up to date with the latest software tools, so give them the time to get themselves acquainted with new tech. Overwhelming the workforce with rapid changes can and often does make the whole process of digitizing or updating an accounting firm’s business process a counterproductive approach. Instead, introduce tech at a consistent but gradual pace, supplemented with the necessary training to operate the software when needed.

ML is Key to Improving Efficiency in the Accounting Sector

The accounting industry is adapting in response to advances in AI technology. More accounting companies are using machine learning to address some of the most pressing challenges facing the industry.

The post Accountants Are Using Machine Learning to Boost Efficiency appeared first on SmartData Collective.

Source: SmartData Collective

Enhance your analysis with new international Google Trends datasets in BigQuery

Sharing and exchanging data with other organizations is a critical element of any organization’s analytics strategy. In fact, BigQuery customers are already sharing data using our existing infrastructure, with over 4,500 customers swapping data across organizational boundaries. Creating seamless access to analytics workflows and insights has become that much easier with the introduction of Analytics Hub and surfacing datasets unique to Google.

Last summer, the Google Trends public dataset was launched to democratize access to Google first-party data and drive additional value to our customers. At no additional cost, you can access Top 25 stories and Top 25 Rising search queries in the United States through a SQL-interface, unlocking countless new opportunities to derive insights from blending Google Trends datasets with other structured data sources. Since launching in June of 2021, over 30 terabytes of the Google Trends dataset have been queried by users across the United States. 

From joining the Search Trends data to Nielsen Designated Market Area (DMA) boundaries to know where to activate marketing campaigns, to creating term forecasts and predictions to inform product development experiments, there is a broad range of applications across many business and consumer profiles. Through secure and streamlined access to this highly desirable data in BigQuery, businesses and consumers alike are finally able to make better data-driven decisions at scale.

With the success of the Google Trends dataset launch in the United States, we knew that meeting the needs of our global counterparts would be a fast follow. After all, we are citizens of a global economy and must do better to accommodate the world we operate in. As such, we began our journey to provide a more comprehensive view of how trends occur across the globe for our customers.

What’s new?

Today, we are excited to announce the expansion of the Google Trends public dataset beyond the US to cover approximately 50 additional countries worldwide. This is available in public preview and covers all major countries where the Google Trends service exists today. Most of the features of the international Google Trends dataset will mimic its United States counterpart, backed by the same privacy-first mindset. 

The international dataset will remain anonymized, indexed, normalized, and aggregated prior to publication. New sets of top terms and top rising queries will continue to be generated daily, with data being inserted into a new partition of their respective table. The expiration date of each top term and top rising set (e.g. each set’s partition) will also stay at 30 days. Every term within a set will still be enriched with a historical backfill over a rolling five year period. Learn more about the schema of each table in the dataset listing.

In addition to surfacing the top trends in the United States by Designated Market Area (DMA), the international dataset will provide the daily top stories and top rising queries by ISO country and sub-region. Countries and/or sub-regions may be excluded based on data-sharing regulation and policies. The sheer scale of coverage and reach now increases multi-fold by simply applying similar or existing use cases to different parts of the globe.

Working with the international Google Trends dataset

Just like all other Google Cloud datasets, users can obtain access without charges of up to 1TB/month in queries and up to 10GB/month in storage through BigQuery’s free tier and leverage the BigQuery sandbox, all subject to BigQuery’s free tier thresholds.

To begin exploring the global Google Trends dataset, simply query the international tables for the top 25 and top 25 rising terms from the Google Cloud Console. To minimize the data scanned and processed, utilize the partition filter, as well as country and region filters (if possible) in your query:

SELECT
  *
FROM
  `bigquery-public-data.google_trends.international_top_terms`
WHERE
  refresh_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  AND country_code = 'CA'
  AND region_name = 'Alberta'


We’ve also updated the Looker dashboard to incorporate the new global dataset, and it even includes filtering for the countries and regions you care about most.

What’s next for Google Cloud Datasets?

We are continuing to progress forward in the path to making Google’s first-party data universally accessible. Stay tuned for updates on more dataset launches and availability, as well as our integration with Analytics Hub. In the meantime, explore the new international Google Trends dataset in your own project, or if you’re new to BigQuery spin up a project using the BigQuery sandbox.

Related Article

Top 25 Google Search terms, now in BigQuery

Google Trends datasets for the Top 25 terms and Top 25 Rising terms now available in BigQuery to enhance your business analyses


Source: Data Analytics

8 Steps to Leveraging Analytics to Create Successful Ecommerce Stores

Analytics technology is taking the ecommerce industry by storm. Ecommerce companies are expected to spend over $24 billion on analytics in 2025.

While there is no debating the huge benefits that analytics technology brings to the ecommerce sector, many experts are pondering what those actual benefits are. New ecommerce startups are discovering interesting ways to utilize analytics. Those that have a solid strategy predicated on it will have a higher ROI.

How Can Your New Ecommerce Startup Take Advantage of Analytics Technology?

You will have a huge competitive edge in the ecommerce market if you leverage analytics to your fullest potential.

But how do you go about doing this? You can figure out how to take the online market for your goods and services by storm by following our guide to creating an e-commerce store! As shoppers continue to buy more of the things they need on the internet, tens of millions of merchants have migrated online to meet demand. Companies that know how to leverage analytics will have the following advantages:

They will be able to use predictive analytics tools to anticipate future demand for products and services.

They can use data on online user engagement to optimize their business models.

They are able to utilize Hadoop-based data mining tools to improve their market research capabilities and develop better products.

Companies that use big data analytics can increase their profitability by 8% on average. However, ecommerce companies can benefit even more, because they have access to more data that they can leverage.

Keep reading to discover how you can build the next big online retailing company with our step-by-step guide to building a successful analytics-driven e-commerce shop.

Step #1 — Use Analytics to Select the Right Name

Decide on a company name in the early stages of your business process so you can use it on applications and other forms. As you think of new names, check their availability on web hosting sites.

Analytics technology can help you find the right name for your business. There are detailed databases of business names that you can use for inspiration and avoid trademark issues. The USPTO trademark database search is a good start, but you can also search domain registration records.

Some of the most popular web hosting sites with databases on registered domain names include:

Google Domains
GoDaddy
Bluehost
GreenGeeks
HostGator

In addition to having databases of existing domains, they use machine learning to suggest new names for your business. These algorithms are getting better all the time.

If you discover that someone has already claimed your preferred domain name, consider altering it or choosing a different company name altogether. You can always change your company name later; however, it will require significant time and money to rebrand. The company name that you finally settle on should be:

Positive
Memorable
Relevant
Unambiguous
Trademark-able

Step #2 — Develop an Analytics-Based Financial Management Strategy

Analytics is also incredibly important for managing your company’s finances. You can use data analytics for everything from finding the right bank account to lowering your expenses and ensuring you don’t miss any deductions around tax time.

Many people don’t know the difference in banking services offered by different types of banks. It can be confusing to know which bank offers what you need. The business bank category has its own set of characteristics that separate it from traditional banks. For example, checking or savings accounts are often limited in number and charge more when they are frequently accessed. A business bank may offer more lending options to potential customers, making them a popular choice for entrepreneurs looking for funding for their startup company.

Choosing the right bank for your business is an important decision that will affect your company’s bottom line. The best way to do this is to evaluate what features are most important to you and then pinpoint where they overlap. Cashback, for example, might be an attractive offer if you’re open to changing accounts, especially if you put a lot of business expenses on your debit card. What you need is a bank that offers both cashback and the other services that are the most valuable to your business. You can consider options such as Nearside, which makes it easy to sign up online, offers cashback, and doesn’t cost anything to sign up for.

You want to make sure that you can easily integrate financial analytics tools with your bank account. This is going to make it a lot easier to optimize your finances, such as identifying unnecessary recurring expenses and taking advantage of all possible deductions when you file taxes.

Step #3 — Select an E-commerce Platform with a Great Analytics Dashboard

Choose an e-commerce platform to build your company website around. You want to make sure that it has a great analytics dashboard, which is going to make it easier to optimize your business. Ideally, you will also be able to integrate tools like Google Analytics to make more nuanced insights. Smart owners will select their e-commerce platforms first to ensure optimal capability for their company websites.

Some of the best e-commerce platforms with great analytics dashboards include:

Squarespace — Ideal for online brochures and portfolios
WooCommerce — High-quality WordPress plug-in
3dcart — One of the most affordable e-commerce platforms
Weebly — Easy-to-use website builder
Magento — Versatile e-commerce solution
WordPress — The most well-known website builder
BigCommerce — Offers the most included features
Shopify — The most complete e-commerce option
Wix — Free solution for hobby websites

The e-commerce platform that you choose will determine the future of your startup. Some of the questions that you may wish to ask yourself include:

How big do I want to grow my business?
Will this platform accommodate significant growth?
What kind of tech support do I want?
How many plug-ins will I need?
What are the short-term and long-term costs?

Step #4 — Use Analytics to Build and Maintain the Best E-commerce Website

As we have noted in the past, big data is invaluable for developing websites. You should also have an analytics system in place to create the best possible website for your company. Make sure to read some of our previous guides on this process.

When you take an analytics-based approach to web development and website management, you can hook your customers with a dazzling company website that motivates them to act. Whether you choose to build your own website or hire a web design company for a custom site, you'll need to make an impression on visitors within seconds. Some of the things you will need to create your new website include:

Logo and other branding materials
Product photos and descriptions
Homepage content
Customer service page
Company history
Contact page information
Site map

Aside from descriptions and photos, you might want to include other information for your products and services. If you have a lot of listings, you may also need:

Videos
SKUs
Pricing
Specifications
Inventory

Step #5 — Choose a Payment Processor and Shipping Option

Select a secure payment processor so you can start collecting money and setting aside taxes! If you already have an existing business and wish to use your current merchant account and payment gateway, you can do that. Otherwise, you will need to choose a third-party payment processor unless you select an e-commerce platform with a built-in payment system.

Some of the most popular payment system plug-ins include:

Square
PayPal
Stripe
WooCommerce
Authorize.Net

Simplify your order fulfillment processes by integrating the best shipping software with your company website. If you choose Shopify, WooCommerce, or BigCommerce as your e-commerce platform, shipping support comes included. Otherwise, you will need to find the best shipping app for your website, which could cost you a small monthly fee.

Startup businesses need shipping integration because it automatically:

Selects the shipping company
Chooses the shipping method
Prints labels
Provides tracking

Once you have your company website set up, you can test it a few times before the official launch.

Step #6 — Launch Your New E-commerce Website

On Launch Day, you want everything to run smoothly. To increase the chances of a successful launch, you should execute a few test purchases. Your payment processor should offer a mode in which you can test a purchase. These tests check the functionality of your payment system without charging your credit card or debit card.

Consider testing the purchase of different products and services at least a few times. Once you feel satisfied that your payment processor will work, your e-commerce website will be ready for launch. It will take at least a few weeks before Google web crawlers find, analyze, and index your website. In the meantime, you may wish to pay for advertising rather than rely on organic traffic.
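If your processor is Stripe, for instance, a test purchase can be scripted against test-mode keys so no real card is ever charged. This is only a sketch; the key, amount, and description are placeholders.

```python
# A minimal sketch of a pre-launch test purchase, assuming Stripe as the
# payment processor. With a test-mode secret key and Stripe's built-in test
# payment method, no real card is ever charged.
import stripe

stripe.api_key = "sk_test_..."  # test-mode key from the Stripe dashboard

intent = stripe.PaymentIntent.create(
    amount=1999,                      # $19.99, in cents
    currency="usd",
    payment_method="pm_card_visa",    # Stripe's test Visa card
    payment_method_types=["card"],
    confirm=True,
    description="Launch-day test order: sample product",
)
print(intent.id, intent.status)  # expect "succeeded" in test mode
```

Running a handful of these against different products gives you the same confidence as manual test checkouts, without touching a real card.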

Step #7 — Use Analytics to Improve Your Marketing Strategy

Now that you have finished all the hard work required to start a new e-commerce business, you cannot just sit there and expect everything to happen all by itself. You need to put your company out there. Thankfully, marketing your business today proves easier than ever. You will encounter no shortage of apps, tools, and techniques to put your logo in front of consumers’ eyes.

While your homepage, product pages, and search engine optimization will serve as the backbone of your future marketing campaigns, you will need to actively target customers online. Some of the ways that you can reach your key demographics include paying for:

Google Ads to place your website at the top of search engine results pages
Ads on Twitter, Facebook, Instagram, TikTok, and other social media platforms
Qualified leads to add to your email list for an email blast

Each of these approaches will be a work in progress, so you have to take advantage of analytics to make the most of them. You will want to regularly review your data and tweak your marketing strategy. If your marketing analytics platform is set up properly, you can focus on the best-performing keywords in your PPC campaigns, identify the best-converting landing pages, and make sure you are reaching the right demographics with your other marketing strategies.
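A minimal sketch of that review loop, assuming a hypothetical campaign_export.csv pulled from your ad platform, might compute conversion rate and cost per conversion for each landing page so you can shift budget toward what actually converts.

```python
# A minimal sketch of the review loop described above: conversion rate and
# cost per conversion per landing page from a hypothetical campaign export.
import pandas as pd

# Assumed columns: keyword, landing_page, clicks, conversions, cost
df = pd.read_csv("campaign_export.csv")

by_page = df.groupby("landing_page")[["clicks", "conversions", "cost"]].sum()
by_page["conversion_rate"] = by_page["conversions"] / by_page["clicks"]
by_page["cost_per_conversion"] = by_page["cost"] / by_page["conversions"]

print(by_page.sort_values("conversion_rate", ascending=False).head(10))
```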

Step #8 — Profit

You might find yourself waiting for a year or more before your company starts to turn a profit. Indeed, many of the most valued public companies do not turn a profit for years because all revenue gets reinvested back into the business. In the meantime, you can discover a bunch of ways to save your company money.

3 Data Mining Tips for Companies Trying to Understand their Customers

Modern businesses that neglect to invest in big data are at a tremendous disadvantage in an evolving global economy. Smart companies realize that data mining serves many important purposes that cannot be overlooked. The share of companies with data-driven decision-making models increased from 14% to 34% between 2014 and 2021 as more companies recognized its importance.

One of the most important benefits of data mining is gaining knowledge about customers. Smart companies recognize that they need to use data to accurately understand their customers, rather than rely on unfounded assumptions.

Smart Companies Leverage Data Mining to Identify the Best Customer Groups to Target

Marketing is intrinsic to the continued growth of any business. It is how a company thrives: attracting new customers while retaining current ones, supplying them with the product or service they require, and putting a smile on their faces at the same time. But this can be difficult when you don't market to the right group of people, which is why identifying your target audience is so crucial. How do you know if your company is fully tapping into potential markets?

You will have an easier time developing an accurate customer profile with data analytics. Companies are spending $20.8 billion on customer analytics because it has proven so effective. The average company that uses customer analytics has 93% higher profits and 81% higher sales than other companies in its industry.

But how do you use data mining to better understand your customers? Here are three ways to determine the correct target audience for your business by leveraging customer analytics.

Utilize Primary Market Research Resources with Data Mining

By utilizing market research, you can begin to understand all the different variables that play a part in the success of your market, and, in turn, your business. Using data mining to conduct market research can help identify demographics and psychographics, market trends, your competitors, the economy, and much, much more. Acquiring this information will allow you to more effectively grasp an idea of the customers in your specific market, as well as how you might be able to tap into that consumer base.

There are a lot of different primary market research resources that you can use to find this information. You can find government data through sites like Census.gov or you can download reports from private market research companies. You can use a Hadoop interface to find the information that you need when you gain access to these reports. You can also download them with a web scraping tool like OctoParse or ParseHub and then use your own data analytics tool to identify the data you need.
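If you prefer to pull public data straight into your own analysis rather than relying on a point-and-click scraper, a sketch like the one below works. The URL and column names are placeholders; many Census.gov tables are available as CSV downloads.

```python
# A minimal sketch of loading a public market-research table into your own
# analytics tool. The URL and column names are hypothetical placeholders.
import pandas as pd

url = "https://example.com/census-county-business-patterns.csv"  # placeholder export
df = pd.read_csv(url)

# Narrow the table to your market: e.g., one state and one industry code.
segment = df[(df["state"] == "WA") & (df["naics_code"].astype(str) == "454110")]
print(segment[["county", "establishments", "employees", "annual_payroll"]]
      .sort_values("establishments", ascending=False))
```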

Analyze Your Customer Data

Finding the right target audience is extremely important to the success of your business, and shifting your aim toward those most likely to be interested in your service or product is without a doubt the wise decision. However, your current customer data should not be ignored during this endeavor. There will likely be a certain amount of overlap between your current customer base and your target market, so analyzing your social media analytics and customer feedback, as well as your predominant demographics, can give you a good idea of who you should be advertising to.

This will be a lot easier if you have a CRM software application. Most CRM tools have analytics features built in, which make it much easier to analyze customer information and draw more meaningful insights.
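For example, with a hypothetical crm_orders.csv export from your CRM, a short recency-frequency-monetary (RFM) analysis can reveal which customer segments deserve the most marketing attention. The column names here are assumptions about that export.

```python
# A minimal sketch of RFM scoring from a hypothetical CRM order export.
import pandas as pd

orders = pd.read_csv("crm_orders.csv", parse_dates=["order_date"])
now = orders["order_date"].max()

rfm = orders.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (now - d.max()).days),
    frequency=("order_id", "count"),
    monetary=("order_total", "sum"),
)

# Quartile scores: higher is better on every axis.
rfm["r"] = pd.qcut(rfm["recency_days"], 4, labels=[4, 3, 2, 1]).astype(int)
rfm["f"] = pd.qcut(rfm["frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["m"] = pd.qcut(rfm["monetary"], 4, labels=[1, 2, 3, 4]).astype(int)
rfm["rfm_score"] = rfm[["r", "f", "m"]].sum(axis=1)

print(rfm.sort_values("rfm_score", ascending=False).head())
```

Customers with high scores across all three dimensions are your most valuable segment and a natural starting point for lookalike targeting.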

Use a Variety of Resources

Countless resources containing relevant data are available to businesses that want to gather consumer thoughts, opinions, and demographic information. One method is conducting internal surveys or sending surveys out via email, mail, social media, or other means. Some businesses also use interviews for one-on-one information or, alternatively, focus groups, which gather information in group settings. Another easy option is using cold-calling or telephone surveys to obtain data.

Customer Analytics is Essential for Companies Trying to Create Effective Marketing Strategies

Discovering the correct target audience plays a vital role in making your mark as a company, regardless of whether you work in the medical industry, like Northwest Surgery Center, the finance industry, or the manufacturing industry. It's all a matter of getting the data you need to make your business thrive, and you need to leverage the right data mining tools to make the most of it.

Use AI to Get the Most Out of Your Social Media Marketing Strategy

AI technology has become a very important part of modern business. More companies are using AI to automate a number of aspects of their operations and improve their ROI. One study from Accenture found that AI increases the profitability of the average business by 38%.

One of the biggest benefits of AI technology is in the realm of marketing. You can use AI to automate many parts of your marketing strategy. AI can be particularly helpful when it comes to social media marketing.

AI is Invaluable to Social Media Marketing

Nowadays, social media is used for much more than simply commenting on videos and posting pictures of your dog. Many businesses, whether they’re a local restaurant or a medical facility like Northwest Surgery Center, use social media to expand their brand and increase their awareness. It is especially useful for small businesses, which may not have the marketing reach of larger companies.

Despite the obvious benefits of social media marketing, only 48% of companies realize a positive ROI from it. This is largely because they don't use the right tools or develop the right strategy. Companies that take advantage of AI can create more successful social media campaigns.

Here are some ways that small businesses can use AI to create a stellar social media marketing strategy.

Increase Traffic with Automated Content Generation

A business's website contributes greatly to its overall success, and social media can help drive its traffic. Typically, the more traffic your website gets, the better the results your company will see on all fronts. Without a strong social media presence, your business is missing out on a large chunk of organic traffic that could be funneling through your website, spreading your message, and, in turn, generating revenue. Increasing your traffic can be as simple as creating profiles on relevant social media platforms and posting some initial content.

You will have an easier time scaling your traffic by leveraging AI to generate content, and there are a lot of ways to accomplish this. You can create visual content with tools like Photoshop, Canva, and Illustrator, which have sophisticated AI algorithms that make it easier to automate content generation. Photoshop and Illustrator both use AI to record actions that can be applied to batches of images in a folder. You can also use AI video generation tools like Synthesia.io, Lumen5, or AI Studios, and machine learning article generators like Luminoso and The Click Reader to create blog content.
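To make the batch-automation idea concrete without assuming a Photoshop license, here is a minimal sketch using the Pillow library to resize every product photo in a folder to a web-friendly width. The paths and target size are placeholders.

```python
# A minimal sketch of batch image automation using Pillow instead of a
# recorded Photoshop action: resize every JPEG in a folder for the web.
from pathlib import Path
from PIL import Image

src, dst = Path("product_photos"), Path("web_ready")
dst.mkdir(exist_ok=True)

for path in src.glob("*.jpg"):
    with Image.open(path) as img:
        ratio = 1200 / img.width                     # target width of 1200px
        resized = img.resize((1200, int(img.height * ratio)))
        resized.save(dst / path.name, quality=85)    # JPEG quality setting
```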

Build Rapport With Customers by Streamlining Communications with AI

Customers are the driving force behind any successful business, which is why building a strong relationship with them is so vital. Because nearly every person today uses at least one form of social media, you have a better chance of reaching a wider audience. Businesses should come across as sincere in their profiles, connect with and relate to customers, offer promotions and giveaways, and use the broad reach to promote themselves. Small businesses have the advantage of building exceptionally strong relationships with local customers, who may know you personally or will at least feel compelled to support a local business.

AI can help here by improving engagement. You can deploy AI-driven chatbots on Facebook or use other AI tools to automate interactions, and similar tools can streamline your Instagram engagement.
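As a rough illustration, a Messenger chatbot boils down to a small webhook service. The sketch below, assuming Flask and placeholder tokens configured in Meta's developer console, verifies the webhook and sends a canned reply; a production bot would hand the message to an NLP model instead.

```python
# A minimal sketch of an automated-engagement bot for Facebook Messenger.
# Tokens are placeholders; Meta calls the GET route once to verify ownership.
import requests
from flask import Flask, request

app = Flask(__name__)
PAGE_TOKEN = "YOUR_PAGE_ACCESS_TOKEN"
VERIFY_TOKEN = "my-verify-token"

@app.route("/webhook", methods=["GET"])
def verify():
    if request.args.get("hub.verify_token") == VERIFY_TOKEN:
        return request.args.get("hub.challenge", "")
    return "verification failed", 403

@app.route("/webhook", methods=["POST"])
def handle_message():
    payload = request.get_json()
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            if "message" in event:
                reply(event["sender"]["id"],
                      "Thanks for reaching out! A human will follow up shortly.")
    return "ok"

def reply(user_id, text):
    # Send the reply through the Graph API's messages endpoint.
    requests.post(
        "https://graph.facebook.com/v17.0/me/messages",
        params={"access_token": PAGE_TOKEN},
        json={"recipient": {"id": user_id}, "message": {"text": text}},
    )

if __name__ == "__main__":
    app.run(port=5000)
```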

Use AI Targeting to Promote Brand Recognition

People feel comfortable with what they know, and, as a business, being instantly recognizable should be on your checklist. The reason social media works so well for promoting brand recognition is the sheer number of people it can reach. Nearly everyone uses social media daily, and there are multiple platforms to choose from, allowing people to come across brands faster and more easily than through almost any other form of advertisement. By taking the time to focus on your brand's profile and how to properly promote it to anyone who happens to come across it, you make people feel more comfortable using your services.

However, the goal shouldn’t be to engage with everyone under the sun. You want to use data mining tools to understand your customers and leverage AI to automate your outreach strategy.

AI Should be the Backbone of Your Social Media Strategy

These are only a few of the many ways that social media can be merged with AI tools to help your business, but each of them succeeds phenomenally. You will get a lot more bang for your buck if you know how to use AI strategically in your social media efforts. This should serve as a wake-up call to revamp and utilize your business’s social media to the fullest extent if you wish to boost online traffic and strengthen the bond between you and your customers.

Important Considerations When Migrating to a Data Lake

Azure Data Lake Storage Gen2 is based on Azure Blob storage and offers a suite of big data analytics features. It is rapidly becoming the primary choice for companies and developers due to its superior performance. If you don’t understand the concept, you might want to check out our previous article on the difference between data lakes and data warehouses.

Data Lake Storage Gen2 combines the file system semantics, directory- and file-level security, and scale of Azure Data Lake Storage Gen1 with the low-cost tiered storage and high availability/disaster recovery capabilities of Azure Blob storage.

In this article, I will walk you through the process of migrating your data to data lakes.

1. Determine your preparedness

Before anything, you need to learn about the Data Lake Storage Gen2 solution, including its features, prices, and overall design. Compare and contrast the capabilities of Gen1 with those of Gen2. You also want to get an idea of the benefits of data lakes.

Examine a list of known issues to identify any gaps in functionality. Blob storage features like diagnostic logging, access levels, and blob storage lifecycle management policies are supported by Gen2. Check the current level of support if you want to use any of these features. Examine the current level of Azure ecosystem support to ensure that any services on which your solutions rely are supported by Gen2.

What are the differences between Gen1 and Gen2?

Data organization

Gen 1 provides hierarchical namespaces with file and folder support. Gen 2 provides all of this as well as container security and support.

Authorization

Gen 1 uses ACLs for data authorization, while Gen 2 uses ACLs and Azure RBAC for data authorization.

Authentication

Gen 1 supports data authentication with Azure Active Directory (Azure AD) managed identities and service principals, whereas Gen 2 supports data authentication with Azure AD managed identities, service principals, and shared access keys.
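To make the authorization model concrete, here is a minimal sketch of setting a POSIX-style ACL on a Gen2 directory with the azure-storage-file-datalake Python SDK. The account URL, file system, and paths are placeholders.

```python
# A minimal sketch: apply a POSIX-style ACL to a Data Lake Storage Gen2
# directory. Account URL, file system, and directory names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<account>.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client("raw")
directory = fs.get_directory_client("sales/2021")

# Owner gets full access, owning group read/execute, everyone else nothing.
directory.set_access_control(acl="user::rwx,group::r-x,other::---")
print(directory.get_access_control()["acl"])
```

Azure RBAC role assignments can be layered on top of these ACLs for coarse-grained, container- or account-level access.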

These are the major differences between Gen 1 and Gen 2. Having understood these feature differences, if you decide to move your data from Gen 1 to Gen 2, simply follow the steps below.

2. Get ready to migrate

Identify the data sets that you’ll migrate

Take advantage of this chance to purge data sets that are no longer in use and migrate the particular data you need or want in the future. Unless you want to transfer all of your data at once, now is the time to identify logical categories of data that may be migrated in stages.

Perform an aging analysis (or equivalent) on your Gen1 account to determine whether files or folders need to remain in inventory for an extended period of time or whether they are becoming outdated.

Determine the impact of migration

Consider, for example, whether you can afford any downtime during the migration. Such factors can help you identify a suitable migration pattern and select the best tools for the process.

Create a migration plan

You can choose one of these patterns, combine them, or design a custom pattern of your own.

Lift and shift pattern

This is the most basic pattern.

In it, first and foremost, all Gen1 writes are halted. Then, the data is transferred from Gen1 to Gen2 via Azure Data Factory or the Azure Portal, whichever is preferred. ACLs are copied along with the data. All ingestion activities and workloads are redirected to Gen2. Finally, Gen1 is decommissioned.

Incremental copy pattern

In this pattern, you start migrating data from Gen1 to Gen2 (Azure Data Factory is highly recommended for this pattern of migration). ACLs are copied along with the data. Then, you copy new data from Gen1 in stages. When all the data has been transferred, stop all writes to Gen1 and redirect all workloads to Gen2. Finally, Gen1 is decommissioned.

Dual pipeline pattern

In this pattern, you start migrating data from Gen1 to Gen2 (Azure Data Factory is highly recommended for dual pipeline migration). ACLs are copied along with the data. Then, you ingest new data into both Gen1 and Gen2. When all data has been transferred, stop all writes to Gen1 and redirect all workloads to Gen2. Finally, Gen1 is decommissioned.

Bi-directional sync pattern

Set up bi-directional replication between Gen1 and Gen2 (WanDisco is highly recommended for bi-directional sync migration). It includes a data repair feature for existing data. Once all movement has been completed, stop all writes to Gen1 and switch off bi-directional replication. Finally, Gen1 is decommissioned.

3. Migrate data, workloads, and applications

Migrate data, workloads, and applications using the preferred pattern. We recommend testing in small, incremental steps.

To begin, create a storage account and enable the hierarchical namespace functionality. Then, move your data. You can also configure the services of your workloads to point to your Gen2 endpoint.
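As a sketch of that first step, the azure-mgmt-storage Python SDK can create a Gen2-capable account by enabling the hierarchical namespace flag. The subscription ID, resource group, and account name below are placeholders.

```python
# A minimal sketch: create a storage account with the hierarchical namespace
# enabled (i.e., Data Lake Storage Gen2). Names and IDs are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = client.storage_accounts.begin_create(
    "my-resource-group",
    "mygen2account",
    {
        "location": "eastus",
        "kind": "StorageV2",
        "sku": {"name": "Standard_LRS"},
        "is_hns_enabled": True,   # hierarchical namespace = Data Lake Storage Gen2
    },
)
account = poller.result()
print(account.primary_endpoints.dfs)  # the Gen2 (dfs) endpoint for your workloads
```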

4. Switch from Gen1 to Gen2

When you're certain that your apps and workloads can rely on Gen2, you can start leveraging Gen2 to meet your business requirements. Decommission your Gen1 account and turn off any remaining pipelines that are running on it.

You can also migrate your data through the Azure portal.

Conclusion

While switching from Gen1 to Gen2 might seem like a complex and daunting task, it brings with it a host of improvements that you will greatly benefit from in the long run. Keep in mind that the key question when implementing this shift is how you can leverage Gen2 to suit your business requirements.

I hope this article gives you a clear explanation of how to migrate your data to Data Lake Storage Gen2.
