Leveraging OCI Functions (Custom Scripts) in Oracle Analytics Cloud - Part 2
In my previous article I wrote about creating OCI functions from scratch. Now it's time to show you how to register your functions in Oracle Analytics Cloud and invoke them from data flows to transform your data.
Ensuring OCI Functions are Compatible
Once your OCI function has been created, you must add the oac-compatible free-form tag to it to ensure that it's compatible with Oracle Analytics Cloud.
Open the navigation menu in the OCI Console, select Developer Services, and click on the Applications option (Figure 1).
In the Applications page, select the appropriate compartment, and click on the application where you created the function (Figure 2).
In your application page, scroll down and select the Functions option in the Resources pane (Figure 3). Then click on the function name.
In your function page, click on the Add tags button (Figure 4).
In the Add tags dialog, specify None (add a free-form tag) as Tag namespace, oac-compatible as Tag key, and True as Tag value. Then click on the Add tags button (Figure 5).
Open the Tags tab and click Free-form tags to confirm that the tag has been created properly (Figure 6).
Failure to set up the oac-compatible tag properly will prevent the function from being available in Oracle Analytics Cloud (it will appear grayed out when you try to register it).
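The same tag can also be applied from a script instead of the console. The following is a minimal sketch, assuming the OCI Python SDK (the oci package) is installed and a default configuration file exists at ~/.oci/config. The helper names with_oac_tag and tag_function are hypothetical. Since an update typically replaces the full set of free-form tags, the sketch merges the new tag into the existing ones first.

```python
def with_oac_tag(existing_tags):
    """Return a copy of the free-form tags with oac-compatible set to True."""
    merged = dict(existing_tags or {})
    merged["oac-compatible"] = "True"
    return merged


def tag_function(function_ocid):
    """Add the oac-compatible tag to a function (assumes ~/.oci/config exists)."""
    import oci  # imported lazily so with_oac_tag works without the SDK

    client = oci.functions.FunctionsManagementClient(oci.config.from_file())
    # Read the current tags so that unrelated tags are preserved.
    current = client.get_function(function_ocid).data
    details = oci.functions.models.UpdateFunctionDetails(
        freeform_tags=with_oac_tag(current.freeform_tags)
    )
    return client.update_function(function_ocid, details).data
```

Calling tag_function with your function OCID should have the same effect as the console steps above, but it is worth confirming the result in the Tags tab afterwards.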
Creating an OCI Resource Connection
Before registering any function in Oracle Analytics Cloud, you must create an OCI Resource connection to the tenancy where your OCI Functions service is running.
In the Home page of Oracle Analytics Cloud, click on the Create button and select the Connection option (Figure 7).
In the Create Connection dialog, select the OCI Resource connection type (Figure 8).
Then specify a name for the connection, an optional description, your tenancy OCID, your user OCID, and click on the Generate button (Figure 9).
Do not click on the Save button yet! First you have to copy the newly generated API key and paste it into OCI, otherwise you will get an error when saving the connection.
Log in to the OCI Console in a new tab, expand the Profile menu, and select the User settings option (Figure 10).
In the User Details page, scroll down to display the Resources pane, select the API Keys option, and click on the Add API Key button (Figure 11).
In the Add API Key dialog, tick the Paste Public Key radio button, paste your API key in the Public Key textbox, and click on the Add button (Figure 12).
Now go back to the Create Connection dialog in Oracle Analytics Cloud, and click on the Save button. Navigate to the Data page, and select the Connections tab to confirm that your OCI Resource connection has been created properly.
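If you want to double-check that the key stored in OCI is the one you generated, you can compare fingerprints. OCI lists a fingerprint next to each API key; to my knowledge it is the MD5 digest of the DER-encoded public key, written as colon-separated hex pairs. The sketch below assumes the key generated by Oracle Analytics Cloud is a standard PEM public key, and the function name pem_fingerprint is hypothetical.

```python
import base64
import hashlib


def pem_fingerprint(pem_text):
    """Return the colon-separated MD5 fingerprint of a PEM public key."""
    # Keep only the base64 body, dropping the BEGIN/END marker lines.
    lines = [line.strip() for line in pem_text.splitlines()]
    body = "".join(l for l in lines if l and not l.startswith("-----"))
    # The base64 body of a PEM file decodes to the DER encoding of the key.
    der = base64.b64decode(body)
    digest = hashlib.md5(der).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

Pass the text you pasted into the Public Key textbox to pem_fingerprint and compare the result with the fingerprint shown in the API Keys list.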
Registering the Function
Once the OCI Resource connection has been created, we can register our function(s) in Oracle Analytics Cloud.
In the Home page, expand the Page menu, hover over Register Model/Function, and select the OCI Functions option (Figure 13).
The Register a Custom Function dialog is displayed (Figure 14). Click on the connection created previously to proceed.
In the Select an Application dialog (Figure 15), click on the Select button to locate your compartment, and then click on the application that contains the function you want to register.
The Select a Function dialog displays a list with all the functions defined in the selected application (Figure 16). Select the function that you want to register, and click on the Register button to complete this task.
Navigate to the Machine Learning page, and select the Scripts tab to confirm that your function has been registered properly.
Invoking the Function
Registered functions can be invoked from data flows to transform your data. As an example, I'm going to invoke my detect-language function to detect the language of the eFootball reviews I scraped from Metacritic.
In the Home page of Oracle Analytics Cloud, click on the Create button and select the Data Flow option to create a new data flow (Figure 17).
In the Add Data dialog, select the dataset that you want to use as an input for the function, and click on the Add button (Figure 18).
Click on the Add a step icon, and select the Apply Custom Script step (Figure 19).
In the Select Custom Script dialog, select the function that you want to invoke, and click on the OK button (Figure 20).
In the Apply Custom Script section, select the function outputs that you want to add to the result set, and map the columns in your dataset to each of the function input parameters (Figure 21).
From here you can apply additional transformations, add a Save Data step to the flow, and then save and run the data flow to persist your result set.
Troubleshooting
By default, OCI (Python) functions have a maximum memory threshold of 256 MB and a maximum time allowed to run of 30 seconds: if they are exceeded during execution, functions are stopped and error messages are logged.
To override the default values, navigate to your function page in OCI Console, and click on the Edit button (Figure 22).
In the Edit function dialog, specify alternative values for Memory (in MB) and Timeout (in seconds) according to your requirements, then click on the Save changes button (Figure 23). A good practice is to specify values that are close to what the function is actually likely to require, rather than significantly more.
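One way to arrive at realistic values is to measure the function's core logic locally before deploying it. The following is a rough sketch that uses only the Python standard library; measure and sample_transform are hypothetical names. Note that tracemalloc only counts memory allocated by Python itself, so treat the figure as a lower bound rather than the real footprint of the function container.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_mib)."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        # Always collect the numbers and stop tracing, even on failure.
        elapsed = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak_bytes / (1024 * 1024)


def sample_transform(rows):
    """Stand-in for the real processing done inside the function."""
    return [row.upper() for row in rows]


if __name__ == "__main__":
    rows = ["good game", "bad servers"] * 10_000
    _, seconds, mib = measure(sample_transform, rows)
    print(f"elapsed: {seconds:.3f} s, peak Python memory: {mib:.1f} MiB")
```

Running the measurement with an input of realistic size gives a starting point for the Memory and Timeout values, to which you would add some headroom for the runtime and for network calls.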
If for any reason the function fails when invoked from a data flow, Oracle Analytics Cloud only displays a generic error message. For this reason it is essential to enable logging for your OCI application, otherwise debugging your code will be a nightmare!
To enable logging for your function, open the navigation menu in the OCI Console, select Observability & Management, and click on the Logs option (Figure 24).
In the Logs page, click on the Enable service log button (Figure 25).
In the Enable Resource Log dialog, select the proper compartment, Functions from the Service menu, your application from the Resource menu, and Function Invocation Logs from the Log Category menu. Then specify a name for the log and click on the Enable Log button (Figure 26).
From the Logs page you can now click on the name of your log to investigate log data. It's also possible to write custom messages to the log by using the following code in your function:
import logging
[...]
logging.getLogger().info("CUSTOM_MESSAGE")
As an example, I intentionally misspelled the name of the bucket in my function definition before invoking it. The data flow fails, but the error message displayed in Oracle Analytics Cloud is too generic and does not allow me to identify the cause of the issue (Figure 27).
Luckily, logging is enabled, and it provides all the details that I need for troubleshooting (Figure 28).
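To make sure that failures like this one always leave a trace, the core logic of a function can be wrapped so that any exception is logged with its stack trace before being re-raised. The following is a minimal sketch: run_logged and read_reviews are hypothetical names, and the bucket lookup is simulated rather than a real Object Storage call. Messages sent to a named logger propagate to the root logger by default, so they should end up in the same log as messages written with logging.getLogger().

```python
import logging

logger = logging.getLogger(__name__)


def run_logged(step_name, func, *args, **kwargs):
    """Call func, logging start, success, and any exception with traceback."""
    logger.info("starting %s", step_name)
    try:
        result = func(*args, **kwargs)
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("%s failed (args=%r)", step_name, args)
        raise
    logger.info("finished %s", step_name)
    return result


def read_reviews(bucket_name):
    """Simulated bucket read: only one hypothetical bucket name is known."""
    if bucket_name != "reviews-bucket":
        raise LookupError(f"bucket not found: {bucket_name}")
    return ["review 1", "review 2"]
```

With this wrapper, a call such as run_logged("read_reviews", read_reviews, "reviews-bukcet") still fails, but the log entry includes the misspelled bucket name and the traceback, which is usually enough to spot the problem.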
Conclusion
This article illustrates how to register a function and invoke it from a data flow. Business analysts and end users often want greater control when performing data preparation tasks, and in this context, leveraging OCI functions in Oracle Analytics Cloud can give them full control and flexibility over specific data processing needs.
If you are looking into leveraging OCI functions in Oracle Analytics Cloud and want to find out more, please do get in touch or DM me on Twitter @barretbse.