How to build and provision Azure Cognitive Search service in 20 minutes

This Azure service has been here for a while now, but lately got a few improvements which make the integration and use of it even easier and seamless than before.

Just before going any further, if you haven’t read anything about it, I recommend you to start with this article first https://docs.microsoft.com/en-us/azure/search/search-what-is-azure-search as I am not going to dive too much into the details today. This reading is going to be about my personal experience of getting Cognitive Search service provisioned with a bunch of data (of one source) connected to it.

Personally, I like to look at the problems to solve with my business lense. What that means is that focusing on building a content (business value) rather than building the search engine (feature). I am not saying that compromising (non-business related) system features which are helping users to enhance their User system Experience is a good thing to do. All I am saying is stop re-inventing the wheel!

Hey Devs, don’t give me this wiggle face saying things like: “c’mon, it’s not that hard to do it by yourself!”. Yes, but actually it is hard from the time complexity point of view … To build a great search engine with features your audience is going to like would take many weeks of man/hours to do so. These features including Auto-completion, geospatial search, filtering, and faceting capabilities for a rich UX, OCR (ideally backed by AI), key phrase extraction, image text found results highlighting and all of that with ability to scale this service as needed and add as many multiple (and different) data sources as needed.

Can you see my point, now? Did one of your eyebrow just lift up? :)) Anyway .. let’s jump into it and see how long this is going to take me to build in Azure portal.

Steps how to build a first Cognitive Search as a Service

1) Go to Azure portal, search for Cognitive Services and add new one called “Azure Cognitive Search

Adding Azure Cognitive Search to Cognitive Service library in the Azure portal

2) As for all services in Azure space, you need to fill up what Subscription and Resource Group this service will belong to. And as the next step, prefered URL, Geographic location of data centre and pricing Tier. I am choosing the free Tier (should be enough for this exercise) and location close to NZ. The next step is to click on Validate, and on Create button afterwards.

Filling up the service initials

3) The first step in the wizard is “Connect to your data” tab. That means that on this page you can connect to multiple data sources. As you can see from the picture below, quite a few options are available to choose from (and most likely going to cover all of the use case scenarios). For this exercise, I am going to take “Samples” and SQL database. You can add as many data sources as you want (with the respect on limitations of the selected service Tier type).

Adding a connection to the data source

4) At “Add cognitive skills” tab I decided to add a bunch of additional Text Cognitive Skills, even though this step is optional. My reasons are purely investigative and I would like to see how the @search.score field in returning data result sets is going to look like when trying to search my documents by any of these fields from Enriched data set.

Adding extra source fields for cognitive skills run

5) In the next step “Customize target index” (sometimes referred to as a “pull model“) I am going to leave all pre-populated settings as they are as I am happy with it for now. In this step you can configure things like the level of data exposure, data field types, filtering, sorting, etc.

Just to give you a better understanding of what the search index is in this context – think about it as in relational database a search index equates to a table. And also we have documents, which are the items of the index. Think about them as documents are roughly equivalent to rows in a table.

Also, remember to keep a Key field in Edm.String data type. This is a mandatory prerequisite.

Customizing the target indexes

6) In “Create an Indexer” tab (the way how to index data in a scheduled manner) I am not allowed to configure how often should be mapping table (index) build. Reason for it is that the Sample SQL database I am using in this exercise does not use any Change tracking policies (for example  SQL Integrated Change Tracking Policy). Why is needed? Well, basically Cognitive search needs to know when the data delete change happened to address that. You can read more about it here.

For now, I am going to submit this form and moving on.

The service starts provisioning itself (this should not take long to finish) and after a couple of minutes, I should have everything ready for testing.

Create an indexer tab

Testing the Search Service

Now, let’s have a look at “Search explorer” from service level main top menu and craft some data queries. My first query was “Bacheor-Wohnung” word, which nicely got populated into the URL query as value of &search element by itself…

Data result set from an example query

From now on it is all about to know how to use a query syntax (and you can really go hard on this). For more search query examples visit this MS documentation https://docs.microsoft.com/en-us/azure/search/search-explorer?WT.mc_id=Portal-Microsoft_Azure_Search

I have to say that building this service did take me about 20 minutes (for someone who has some experience already) from having nothing to easy-to-configure and scale search engine. Anyone should be able to build the first Cognitive search service by a similar time after reading this post now.

If there are any questions or want to know more about this service, visit this site built by Microsoft at https://docs.microsoft.com/en-us/azure/search/. These people did a really great job in documenting all of it. This material should help you to elevate your skills onto a more advanced level.

What is the Azure Cognitive Search Tiers pricing

_FREEBASICSTANDARD S1STANDARD S2STANDARD S3STORAGE OPTIMIZED L1STORAGE OPTIMIZED L2
Storage50 MB2 GB25 GB
(max 300 GB per service)
100 GB
(max 1 TB per service)
200 GB
(max 2 TB per service)
1 TB
(max 12 TB per service)
2 TB
(max 24 TB per service)
Max indexes per service31550200200 or 1000/partition in high density1 mode1010
Scale out limitsN/AUp to 3 units per service
(max 1 partition; max 3 replicas)
Up to 36 units per service
(max 12 partition; max 12 replicas)
Up to 36 units per service
(max 12 partition; max 12 replicas)
Up to 36 units per service
(max 12 partition; max 12 replicas)
up to 12 replicas in high density1 mode
Up to 36 units per service
(max 12 partition; max 12 replicas)
Up to 36 units per service
(max 12 partition; max 12 replicas)
up to 12 replicas in high density1 mode
Document Cracking: Image ExtractionN/A
(only 20 documents supported)
(price per 1,000 images)
0-1M images – $1.512
1M-5M images – $1.210
5M+ images – $0.983
(price per 1,000 images)
0-1M images – $1.512
1M-5M images – $1.210
5M+ images – $0.983
(price per 1,000 images)
0-1M images – $1.512
1M-5M images – $1.210
5M+ images – $0.983
(price per 1,000 images)
0-1M images – $1.512
1M-5M images – $1.210
5M+ images – $0.983
(price per 1,000 images)
0-1M images – $1.512
1M-5M images – $1.210
5M+ images – $0.983
(price per 1,000 images)
0-1M images – $1.512
1M-5M images – $1.210
5M+ images – $0.983
Private Endpoints Related ChargesN/AAdditional charges may apply2Additional charges may apply2Additional charges may apply2Additional charges may apply2Additional charges may apply2Additional charges may apply2
Price per unitFree$0.153/hour$0.509/hour$2.033/hour$4.065/hour$5.805/hourN/A
Azure Cognitive Search Tiers pricing

Overall Azure service rating

  • it is very easy to create own search SaaS in couple of minutes
  • intuitive way how to integrate new data sources to the service
  • easy to leverage cognitive capabilities in features like OCR
  • CONVENIENCE – zero coding required on service side, all search service settings can be configured in Azure portal
5/5 Rambo rating