Our client has many business users who need access to the Redshift cluster for analysis and it is not really practical to create and maintain users directly on the cluster for every user so they ended up in sharing the same user credentials to everyone now the business users started to abuse the cluster by sending many poorly written queries.

We wanted to solve both problems of not creating users directly on the cluster and at the same time, knowing who executed what query on the cluster so that we can educate them to send queries with the right filters. Our customer is already using Active Directory (AD) so we decided to integrate the AD with Redshift using ADFS so I am going to explain more about it in this blog.

This blog assumes that you already have knowledge on AWS Redshift, IAM, AD and ADFS and if you are new to these topics then please visit the following blogs first.

Setup

Redshift access with ADFS integration process

AWS Redshift Setup

Connect to your cluster using your master user and password.

Create DB Groups for AD users to join, here we are creating a DB group for read-only access on all the table.

CREATE GROUP readonly; -- Following command needs to be run every-time you create new tables GRANT SELECT on ALL TABLES IN SCHEMA public TO group readonly; -- For non-public schemas you will also need to grant USAGE PRIVILEGES GRANT USAGE ON SCHEMA <non-public-schema-name> TO group readonly;

GRANT SELECT on ALL TABLES IN SCHEMA <non-public-schema-name> TO group readonly;

Please use only lower case letters as DB group name otherwise connection errors occur while connecting to redshift using ADFS.

You can create more groups as per your use case if you do that you need to map them to your AD groups also using “Relying Party Trust” claim rules.

AD Setup

If you don’t have AD already then please follow this blog to create it.

We recommend you to create separate AD groups to provide access to your AD users to the Redshift cluster as it gives better control on managing access to the cluster.

Create an AD group called Redshift-readonly.

Create an AD group with name Redshift-readonly

Please note the format for AD group name: Redshift-{DbGroupName}. Redshift- prefix for the AD group name is very important as it will be used in “Relying Party Trust” claim rules while configuring ADFS.

AWS IAM Setup

Once AD authentication is successful, IAM will provide the temporary AWS credentials. And then we need to call GetClusterCredentials Redshift API with those AWS credentials to get the temporary DB credentials. This API creates the user and adds to a specified DbGroups so we also need to provide “redshift:CreateClusterUser” and “redshift:JoinGroup” permissions to the IAM role.

Create an IAM identity provider by the following instruction from this blog.

Create IAM policy with Redshift cluster temporary credentials creation permissions.

{

"Version": "2012-10-17",

"Statement": [

{

"Sid": "GetClusterCredsStatement",

"Effect": "Allow",

"Action": [

"redshift:GetClusterCredentials"

],

"Resource": [

"arn:aws:redshift:*:*:dbname:<cluster-identifier>/<db-name>",

"arn:aws:redshift:*:*:dbuser:<cluster-identifier>/${redshift:DbUser}"

]

},

{

"Sid": "CreateClusterUserStatement",

"Effect": "Allow",

"Action": [

"redshift:CreateClusterUser"

],

"Resource": [

"arn:aws:redshift:*:*:dbname:<cluster-identifier>/<db-name>",

"arn:aws:redshift:*:*:dbuser:<cluster-identifier>/${redshift:DbUser}"

]

},

{

"Sid": "RedshiftJoinGroupStatement",

"Effect": "Allow",

"Action": [

"redshift:JoinGroup"

],

"Resource": [

"arn:aws:redshift:*:*:dbgroup:<cluster-identifier>/readonly"

]

}

]

}

Create an IAM role for ADFS using previously create identity provider with SAML 2.0 federation option.

AWS IAM Role for ADFS to login to Redshift cluster

Attribute: SAML:aud

Value: https://signin.aws.amazon.com/saml

Attach the previously created policy to this role.

ADFS Setup

In this blog, I am going to talk only about the Redshift authentication related integration with ADFS so in case you are new to ADFS then please go through this blog first to understand concept of integrating ADFS with AWS.

Create “Relying Party Trust” claim rule to store the AD groups logging user belongs to a temporary variable: “http://temp/variable” using the Custom rule template.

Expose the user AD groups as a temporary variable



=> add(store = "Active Directory", types = (" c:[Type == " http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname ", Issuer == "AD AUTHORITY"]=> add(store = "Active Directory", types = (" http://temp/variable "), query = ";tokenGroups;{0}", param = c.Value);

Map AD groups to AWS IAM role, this role will give access to Redshift cluster after the user is successfully authenticated with AD.

AD groups to IAM role with redshift access

Map Role SessionName and DbUser to user’s email address. This will help us to know the logged in user identity.

Map RoleSession and DbUser to EmailAddress

Map AD groups related to Redshift to DbGroups. Here I am using a regular expression to get DbGroups from AD groups which got stored in “http://temp/variable” in the previous rule.

AD groups to Redshift DB groups

Add claim rule for Redshift AutoCreate DB user attribute.

Redshift AutoCreate DB user Claim rule

SQL Workbench/J setup

Please setup SQL Workbench to connect with Redshift using this AWS documentation. Make sure that you are using a driver with AWS Java SDK or set it in the classpath.

Create credentials profile in <Home-Directory>/.aws/credentials for ADFS authentication.

[redshift-adfs-read] plugin_name=com.amazon.redshift.plugin.AdfsCredentialsProvider idp_host=<your-adfs-idp-endpoint> idp_port=443 preferred_role=arn:aws:iam::<AWSAccount-ID>:role/Redshift-readonly

You need to add ssl_insecure=true to above configuration in case you are using self signed certificates in your test setup.

Connect to Redshift using the following JDBC connection.

jdbc:redshift:iam://<cluster-end-point>:<db-port>/<db-name>?Profile=redshift-adfs-read

You can also use Extentend Properties from SQL workbench directly instead of the Profile to set the properties plugin_name, idp_host, idp_port, preferred_role.

Extended Properties for SQL Workbench/J

# If you are using extended properties then use the following connection string:

jdbc:redshift:iam://<cluster-end-point>:<db-port>/<db-name>d

The connection to Redshift is successful using AD authentication

Now we can see the queries run by various users from Redshift console directly.

Queries run on the cluster by various users

Conclusion

We have learned how to use AD authentication to log in to Redshift. This surely helps us to analyze the Redshift performance and educate the user making improper queries to the cluster. I know this is pretty complex setup so you may not get it right in one go so please try it with patience and feel free to contact me for any help or clarifications.