LEARN SAML DEEP DIVE
Learn SAML: From XML Basics to Building an IdP/SP from Scratch
Goal: To deeply understand Security Assertion Markup Language (SAML) by implementing its core components—message formats, bindings, and security mechanisms—from scratch in C. This journey will demystify enterprise Single Sign-On (SSO) and give you a first-principles understanding of federated identity.
Why Learn SAML from Scratch?
SAML is the backbone of countless enterprise SSO solutions. While many applications integrate with SAML using existing libraries, truly understanding SAML’s intricacies requires grappling with its XML-based structure, cryptographic requirements, and various communication patterns. This is particularly true when debugging complex integrations or working with highly specialized environments.
By implementing a simplified Service Provider (SP) and Identity Provider (IdP) in C, you will move beyond library abstractions. You’ll confront the challenges of XML parsing, digital signature verification, and HTTP message handling, giving you an unparalleled insight into how federated identity truly works.
After completing these projects, you will:
- Understand the roles of Identity Providers (IdP) and Service Providers (SP) in federated identity.
- Be able to generate and parse SAML XML messages, including AuthnRequest and SAMLResponse.
- Implement SAML message bindings, specifically HTTP Redirect and HTTP POST.
- Understand and implement XML Digital Signatures for SAML assertions.
- Grasp the core SAML workflow for Web Browser SSO and Single Logout (SLO).
- Be equipped to debug common SAML integration issues, such as signature failures, expired validity windows, and audience mismatches.
Core Concept Analysis
The Federated Identity Landscape
SAML solves the problem of a user needing to authenticate separately with multiple applications across different security domains. Instead, they authenticate once with an Identity Provider (IdP), which then asserts their identity to various Service Providers (SPs).
User's Browser                Service Provider (SP)          Identity Provider (IdP)
                              (e.g., Salesforce)             (e.g., Okta, ADFS)
|                                       |                               |
| 1. Request protected resource         |                               |
+-------------------------------------->|                               |
| 2. Redirect with AuthnRequest         |                               |
|<--------------------------------------+                               |
| 3. Browser follows the redirect, delivering the AuthnRequest          |
+---------------------------------------------------------------------->|
| 4. Authenticate user (if needed)                                      |
|<--------------------------------------------------------------------->|
| 5. SAML Response (signed assertion), returned to the browser          |
|<----------------------------------------------------------------------+
| 5. (cont.) POST to the SP's ACS       |                               |
+-------------------------------------->|                               |
|                                       | 6. Validate Response,         |
|                                       |    establish session          |
Key SAML Concepts
- Identity Provider (IdP): The authority that authenticates the user and issues SAML assertions about them.
- Service Provider (SP): The application or service that relies on the IdP for user authentication. It consumes SAML assertions.
- SAML Assertion: An XML document issued by the IdP that contains statements about an authenticated user. It’s the core “proof” of identity.
  - AuthnStatement: Confirms that the user was authenticated, and how and when.
  - AttributeStatement: Provides user attributes (e.g., email, role).
  - NameID: A unique identifier for the user, carried in the assertion's Subject.
- SAML Protocol Messages: XML documents used for requesting and responding.
  - AuthnRequest: Sent by the SP to the IdP to request user authentication.
  - Response: Sent by the IdP to the SP, containing a SAML assertion.
- SAML Bindings: Define how SAML protocol messages are transported over network protocols (like HTTP).
  - HTTP Redirect Binding: Messages are DEFLATE-compressed, then Base64-encoded, then URL-encoded, and placed in the query string of an HTTP redirect (302 or 303). Used for small messages like AuthnRequest.
  - HTTP POST Binding: Messages are Base64-encoded and embedded as a hidden field in an HTML form. This form is then automatically submitted via JavaScript. Used for larger messages like SAMLResponse.
- SAML Profile: A set of rules that defines how assertions, protocols, and bindings combine to support a specific use case (e.g., Web Browser SSO Profile).
- SAML Metadata: An XML document that describes the configuration of an IdP or SP (endpoints, supported bindings, public certificates, entity ID). It enables automated configuration.
- Security:
  - XML Digital Signatures: Used to ensure the integrity and authenticity of SAML messages and assertions. The IdP signs the assertion, and the SP verifies it using the IdP’s public certificate.
  - Encryption: SAML assertions (especially sensitive attributes) can be encrypted for confidentiality.
Project List
These projects guide you through implementing core SAML features. We’ll use C, leveraging libxml2 for XML parsing and OpenSSL for cryptography, as these are the standard low-level libraries for such tasks.
Project 1: Basic XML Structure Generator/Parser for SAML Messages
- File: LEARN_SAML_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Python (lxml), Java (javax.xml)
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 1: Beginner
- Knowledge Area: XML Parsing / Data Structures
- Software or Tool: libxml2
- Main Book: “SAML 2.0 Identity Federation” by Aaron Sachs, et al.
What you’ll build: A C program that can generate the basic XML structure for a SAML AuthnRequest and parse a simple SAML Response to extract key information like the Assertion’s NameID and Issuer.
Why it teaches the fundamentals: SAML is XML. This project forces you to understand the specific XML schemas, namespaces, and element hierarchy that constitute SAML messages. You’ll learn how to programmatically create and navigate complex XML documents using libxml2.
Core challenges you’ll face:
- Understanding XML namespaces → maps to libxml2 functions for creating/finding nodes with namespaces.
- Generating a unique ID → maps to creating UUIDs for ID attributes (prefix the value with an underscore or letter, since an xs:ID cannot start with a digit).
- Setting IssueInstant timestamps → maps to formatting dates/times in UTC for xs:dateTime.
- Navigating XML trees → maps to using libxml2’s xmlDocGetRootElement, xmlFirstElementChild, xmlNextElementSibling.
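The ID and IssueInstant challenges need nothing beyond the C standard library. The sketch below is one possible approach, not the only one: the helper names `format_issue_instant` and `make_saml_id` are hypothetical, the ID is built from random bytes rather than a formal UUID, and it assumes `/dev/urandom` is available (Linux, macOS, BSD).

```c
#include <stdio.h>
#include <time.h>

/* Format a time_t as an xs:dateTime in UTC, e.g. "2025-12-21T10:00:00Z".
 * Returns 0 on success, -1 if conversion fails or the buffer is too small. */
int format_issue_instant(time_t t, char *out, size_t out_size)
{
    struct tm *utc = gmtime(&t);  /* static storage: fine for a single-threaded sketch */
    if (utc == NULL)
        return -1;
    return strftime(out, out_size, "%Y-%m-%dT%H:%M:%SZ", utc) > 0 ? 0 : -1;
}

/* Build an ID such as "_3f9a..." from 16 random bytes (32 hex digits).
 * The leading underscore keeps the value a valid xs:ID, which must not
 * start with a digit. Assumption: /dev/urandom exists on this system. */
int make_saml_id(char *out, size_t out_size)
{
    unsigned char raw[16];
    FILE *fp;
    size_t i;

    if (out_size < 2 * sizeof raw + 2)      /* '_' + 32 hex digits + NUL */
        return -1;
    fp = fopen("/dev/urandom", "rb");
    if (fp == NULL)
        return -1;
    if (fread(raw, 1, sizeof raw, fp) != sizeof raw) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    out[0] = '_';
    for (i = 0; i < sizeof raw; i++)
        sprintf(out + 1 + 2 * i, "%02x", raw[i]);
    return 0;
}
```

Calling `format_issue_instant(time(NULL), buf, sizeof buf)` yields the current instant; a 34-byte buffer is enough for `make_saml_id`.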
Key Concepts:
- SAML 2.0 Core Specification: Section 3 (SAML Assertions), Section 4 (SAML Protocols).
- XML Namespaces: Crucial for avoiding element name collisions, especially in SAML where multiple schemas are used.
- libxml2 Basics: The library functions for creating and manipulating XML documents.
Difficulty: Beginner
Time estimate: 1-2 days
Prerequisites: Basic C programming, familiarity with Makefiles to link libxml2.
Real world outcome:
<!-- Example AuthnRequest generated by your tool -->
<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
ID="_abc123"
Version="2.0"
IssueInstant="2025-12-21T10:00:00Z"
AssertionConsumerServiceURL="https://sp.example.com/acs">
<saml:Issuer xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">https://sp.example.com/metadata</saml:Issuer>
</samlp:AuthnRequest>
And your parser program should be able to take such an XML string and extract values like _abc123, 2025-12-21T10:00:00Z, https://sp.example.com/acs, and https://sp.example.com/metadata.
Implementation Hints:
- Include libxml/parser.h and libxml/tree.h.
- Use xmlNewNode and xmlNewNs to create elements with correct namespaces.
- Use xmlNewProp to add attributes.
- For parsing, use xmlParseMemory (or the newer xmlReadMemory) to get an xmlDocPtr, then xmlDocGetRootElement to start traversing.
- Use xmlGetProp to read attribute values and xmlNodeGetContent to read element text content. Both return copies that you must release with xmlFree.
- Remember to call xmlCleanupParser() at the end.
Learning milestones:
- You can generate a basic SAML AuthnRequest XML string with correct namespaces and attributes. → You understand SAML XML structure.
- You can parse a simple SAML Response (provided as a static string) and extract its top-level attributes. → You understand XML parsing.
- You can correctly extract the NameID and Issuer from a simple Assertion within a Response. → You’re navigating SAML content.
Project 2: Implement HTTP Redirect Binding (SP Initiated AuthnRequest)
- File: LEARN_SAML_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Python, Node.js
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: HTTP / Data Encoding / Compression
- Software or Tool: libcurl (optional, for making HTTP calls), zlib
- Main Book: “SAML 2.0 Identity Federation” by Aaron Sachs, et al.
What you’ll build: Extend Project 1. Create a simple SP-side program (it can be a C program that prints the URL to stdout, or a minimal web server). When a user (or curl from a terminal) tries to access a protected resource, it generates an AuthnRequest, then DEFLATE-compresses, Base64-encodes, and URL-encodes it, in that order. Finally, it constructs a full HTTP 302 Redirect URL to an (imaginary) IdP’s SSO endpoint.
Why it teaches the fundamentals: This project reveals how SAML messages are actually transmitted over HTTP. You’ll learn about DEFLATE compression, Base64 encoding, and URL encoding – fundamental web technologies that SAML heavily relies on.
Core challenges you’ll face:
- Deflating XML data → maps to using zlib to compress the XML string.
- Base64 encoding → maps to implementing a Base64 encoder or using an existing library function.
- URL encoding (or percent-encoding) → maps to converting special characters (like /, +, =) to their %XX equivalents.
- Constructing an HTTP Redirect URL → maps to building the Location header for a 302 response.
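Two of these challenges, Base64 and percent-encoding, can be sketched in plain C with no libraries. The function names `base64_encode` and `url_encode` are hypothetical, and the URL encoder takes the conservative route of escaping everything outside the RFC 3986 unreserved set.

```c
#include <stddef.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode len bytes (RFC 4648, with '=' padding).
 * out must hold at least 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the length of the resulting string. */
size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t i, o = 0;

    for (i = 0; i + 2 < len; i += 3) {          /* full 3-byte groups */
        out[o++] = B64[in[i] >> 2];
        out[o++] = B64[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
        out[o++] = B64[((in[i + 1] & 0x0f) << 2) | (in[i + 2] >> 6)];
        out[o++] = B64[in[i + 2] & 0x3f];
    }
    if (i < len) {                              /* 1 or 2 trailing bytes */
        out[o++] = B64[in[i] >> 2];
        if (i + 1 < len) {
            out[o++] = B64[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
            out[o++] = B64[(in[i + 1] & 0x0f) << 2];
        } else {
            out[o++] = B64[(in[i] & 0x03) << 4];
            out[o++] = '=';
        }
        out[o++] = '=';
    }
    out[o] = '\0';
    return o;
}

/* Percent-encode every byte except the RFC 3986 unreserved characters.
 * Returns 0 on success, -1 if out (out_size bytes, including NUL) is too small. */
int url_encode(const char *in, char *out, size_t out_size)
{
    static const char HEX[] = "0123456789ABCDEF";
    size_t o = 0;

    if (out_size == 0)
        return -1;
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        int unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                         (c >= '0' && c <= '9') ||
                         c == '-' || c == '.' || c == '_' || c == '~';
        if (o + (unreserved ? 1 : 3) >= out_size)
            return -1;
        if (unreserved) {
            out[o++] = (char)c;
        } else {
            out[o++] = '%';
            out[o++] = HEX[c >> 4];
            out[o++] = HEX[c & 0x0f];
        }
    }
    out[o] = '\0';
    return 0;
}
```

For the Redirect binding, the Base64 input is the compressed byte buffer rather than the XML text, so the encoder takes a length instead of relying on a NUL terminator.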
Key Concepts:
- SAML 2.0 HTTP Redirect Binding: The specific rules for packaging SAML messages into a URL.
- zlib compression: “RFC 1951 - DEFLATE Compressed Data Format”.
- Base64 Encoding: “RFC 4648 - Base64 Encoding”.
- URL Encoding: “RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax”.
Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 1, basic understanding of HTTP headers.
Real world outcome: Your SP program will output a URL that, if opened in a browser, would redirect the user to an IdP:
$ ./generate_saml_redirect_url
DEBUG: Generated AuthnRequest: <samlp:AuthnRequest ...>
DEBUG: Deflated and Base64-encoded: fVK/T...
DEBUG: URL-encoded: fVK%2FT...
Redirect URL: https://idp.example.com/sso?SAMLRequest=fVK%2FT...&RelayState=some_opaque_value
Implementation Hints:
- Start with the XML generated in Project 1.
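- Use zlib functions like deflateInit2, deflate, deflateEnd to compress the XML string. Pass a negative windowBits value (-15) to deflateInit2 so that the output is a raw DEFLATE stream; plain deflateInit wraps the data in a zlib header and checksum, which IdPs will reject. Store the compressed bytes.
- Implement (or find a public domain implementation of) a Base64 encoder to convert the compressed bytes into a Base64 string.
- Implement a URL encoder. Be careful with characters like +, /, and = (all of which Base64 can produce), as well as & and ?.
- Construct the final URL. The RelayState parameter is optional but often included.
- If you want to simulate an HTTP server, you’d print HTTP/1.1 302 Found\r\nLocation: <your_url>\r\n\r\n to the socket (HTTP header lines end in CRLF).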
Learning milestones:
- You can compress a string using zlib and decompress it back. → You understand DEFLATE.
- You can Base64-encode and decode a string. → You understand Base64.
- You can URL-encode a string correctly. → You understand URL encoding.
- You can generate a full SAML AuthnRequest URL via HTTP Redirect Binding. → You’ve mastered SAML request transport.
Project 3: Implement HTTP POST Binding (IdP Response)
- File: LEARN_SAML_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Python, Node.js
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: HTTP / HTML Form Generation
- Software or Tool: Basic C web server (socket programming)
- Main Book: “SAML 2.0 Identity Federation” by Aaron Sachs, et al.
What you’ll build: Create a basic IdP-side C web server (e.g., listening on port 8080). This server will:
- Receive an incoming HTTP GET request containing an AuthnRequest (from Project 2, if you implemented a web server for your SP).
- (For now, just acknowledge the request and assume authentication success).
- Construct a dummy SAML Response XML (similar to Project 1, but more detailed, including Assertion with NameID and AudienceRestriction).
- Base64-encode this SAMLResponse.
- Generate an HTML form containing the Base64-encoded response as a hidden input field, along with a JavaScript snippet to auto-submit the form to a specified SP’s Assertion Consumer Service (ACS) URL.
Why it teaches the fundamentals: This project implements the other primary SAML binding. You’ll learn how the IdP sends its authentication results back to the SP through the user’s browser, using an auto-submitting HTML form.
Core challenges you’ll face:
- Setting up a basic C web server → maps to socket programming for HTTP requests/responses.
- Parsing incoming URL query parameters → maps to extracting the SAMLRequest from the GET URL.
- Generating SAML Response XML → maps to extending Project 1’s XML generation for a Response structure.
- Embedding Base64-encoded XML in HTML → maps to understanding the structure of the HTML form for HTTP POST binding.
Key Concepts:
- SAML 2.0 HTTP POST Binding: How SAML messages are embedded in HTML forms.
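- HTTP GET and POST methods: The difference in how data is sent.
- NameID and AudienceRestriction: Key elements in a SAML Assertion. NameID identifies the user; AudienceRestriction names the SP the assertion is intended for, so an assertion issued for one SP cannot be accepted by another. (Replay protection comes from the validity window and from tracking assertion IDs, not from the audience.)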
Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 2, basic socket programming in C.
Real world outcome: Your IdP server, when hit by a browser, will return an HTML page like this. When the browser receives this, it will automatically submit the form to the SP.
<html>
<head><title>SAML 2.0 SSO</title></head>
<body onload="document.forms[0].submit()">
<noscript>
<p><strong>Note:</strong> Since your browser does not support JavaScript,
you must press the Continue button once to proceed.</p>
</noscript>
<form action="https://sp.example.com/acs" method="post">
<div>
<input type="hidden" name="SAMLResponse" value="<BASE64_ENCODED_SAML_RESPONSE_XML>"/>
<input type="hidden" name="RelayState" value="some_opaque_value"/>
</div>
<noscript>
<div>
<input type="submit" value="Continue"/>
</div>
</noscript>
</form>
</body>
</html>
Implementation Hints:
- Set up a simple TCP server that listens on a port (e.g., 8080).
- When a connection comes in, read the HTTP GET request line.
- Extract the SAMLRequest parameter from the URL. (You’ll need to URL-decode, Base64-decode, and inflate it to get the raw XML for parsing, but for this project, you can just acknowledge its presence.)
- Generate a SAMLResponse XML with a dummy NameID (e.g., “testuser”) and AudienceRestriction set to the SP’s entity ID.
- Base64-encode the SAMLResponse XML string.
- Construct the full HTML form shown above, inserting your encoded response and the SP’s ACS URL.
- Send the HTTP/1.1 200 OK header followed by the HTML form over the socket.
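The last two hints can be sketched as a single function that renders the complete HTTP response into a buffer, ready to be written to the socket. This is a simplified sketch: `build_post_binding_response` is a hypothetical name, the form is a trimmed version of the one shown above, the 8 KB body limit is arbitrary, and the ACS URL and RelayState are assumed to be already escaped for use in an HTML attribute (Base64 text needs no escaping).

```c
#include <stdio.h>

/* Render status line, headers, and an auto-submitting form into out.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * either buffer is too small.
 * Assumption: acs_url and relay_state are already HTML-attribute-escaped. */
int build_post_binding_response(const char *acs_url,
                                const char *saml_response_b64,
                                const char *relay_state,
                                char *out, size_t out_size)
{
    char body[8192];
    int body_len, total;

    body_len = snprintf(body, sizeof body,
        "<html><body onload=\"document.forms[0].submit()\">"
        "<form action=\"%s\" method=\"post\">"
        "<input type=\"hidden\" name=\"SAMLResponse\" value=\"%s\"/>"
        "<input type=\"hidden\" name=\"RelayState\" value=\"%s\"/>"
        "<noscript><input type=\"submit\" value=\"Continue\"/></noscript>"
        "</form></body></html>",
        acs_url, saml_response_b64, relay_state);
    if (body_len < 0 || (size_t)body_len >= sizeof body)
        return -1;

    /* HTTP header lines end in CRLF; a blank line separates headers and body. */
    total = snprintf(out, out_size,
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        "Content-Length: %d\r\n"
        "Cache-Control: no-store\r\n"
        "Connection: close\r\n"
        "\r\n"
        "%s",
        body_len, body);
    if (total < 0 || (size_t)total >= out_size)
        return -1;
    return total;
}
```

The server would then pass the buffer and the returned length to `send()` (or `write()`) on the accepted socket.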
Learning milestones:
- Your IdP server can receive a basic HTTP GET request. → You understand basic HTTP server functionality.
- Your IdP can generate a SAML Response XML with an Assertion containing a NameID and AudienceRestriction. → You’re generating core SAML content.
- Your IdP can construct the correct HTML form for HTTP POST binding, including the Base64-encoded SAMLResponse and auto-submit JavaScript. → You’ve mastered SAML response transport.
Project 4: Digital Signatures for SAML Assertions
- File: LEARN_SAML_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Python (xmlsec), Java (Apache Santuario)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Cryptography / XML Security
- Software or Tool: OpenSSL (for certificates and signing), libxml2 (XMLDSig part)
- Main Book: “SAML 2.0 Identity Federation” by Aaron Sachs, et al. (Chapter 6: XML Signatures and Encryption)
What you’ll build: Extend Project 3. The IdP should now digitally sign the Assertion element within the SAMLResponse using a private key and X.509 certificate. The SP (from Project 5, if it’s already implemented) should be able to receive this signed response and verify the signature using the IdP’s public certificate.
Why it teaches the fundamentals: Digital signatures are critical for SAML security. Without them, an SP cannot trust that an assertion truly came from the IdP or that it hasn’t been tampered with. This project forces you to understand XML Digital Signature (XMLDSig) standards, certificate management, and cryptographic verification flows.
Core challenges you’ll face:
- Generating X.509 certificates and private keys → maps to using openssl req and openssl genrsa.
- XMLDSig Structure: Understanding the <ds:Signature> element, including SignedInfo, CanonicalizationMethod, SignatureMethod, Reference, DigestMethod, DigestValue, and SignatureValue.
- Canonicalization: Understanding why XML needs to be canonicalized before signing.
- Signing with OpenSSL and libxml2: Integrating these libraries to perform the complex XML signing process.
- Signature Verification: On the SP side, parsing the signature, extracting the public key, and using OpenSSL to verify.
Key Concepts:
- XML Digital Signature (XMLDSig): “W3C Recommendation: XML-Signature Syntax and Processing”.
- X.509 Certificates: “RFC 5280 - Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile”.
- Public Key Cryptography: The use of key pairs (private for signing, public for verifying).
- XML Canonicalization (C14N): The process of normalizing XML for signing.
Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Project 3, basic understanding of public key cryptography concepts.
Real world outcome:
Your generated SAMLResponse will now include a <ds:Signature> block within the Assertion element:
<samlp:Response ...>
<saml:Assertion ...>
<ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
<ds:SignedInfo>
<ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
<ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<ds:Reference URI="#_assertion_id">
<ds:Transforms>
<ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
<ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
</ds:Transforms>
<ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<ds:DigestValue>...</ds:DigestValue>
</ds:Reference>
</ds:SignedInfo>
<ds:SignatureValue>...</ds:SignatureValue>
<ds:KeyInfo>
<ds:X509Data>
<ds:X509Certificate>...</ds:X509Certificate>
</ds:X509Data>
</ds:KeyInfo>
</ds:Signature>
<!-- ... other assertion content ... -->
</saml:Assertion>
</samlp:Response>
And your SP will be able to confirm that the signature is valid.
Implementation Hints:
- Generate Keys/Cert: Use openssl genrsa -out idp-key.pem 2048 and openssl req -new -x509 -key idp-key.pem -out idp-cert.pem -days 365.
- libxml2 and OpenSSL Integration: This is where it gets complex. The XML Security Library (xmlsec, often packaged as libxmlsec1) is a separate library built on top of libxml2 specifically for XMLDSig, but you can also do it more manually with OpenSSL’s raw cryptographic functions and libxml2’s DOM manipulation. For a deep understanding, manually assembling the <ds:Signature> and calculating digests/signatures with OpenSSL is recommended.
- Canonicalization is crucial: The XML needs to be in a precisely defined canonical form before its digest is calculated and signed. libxml2 (via libxml/c14n.h) or xmlsec can handle this.
- Verification: The SP needs the IdP’s public certificate (e.g., idp-cert.pem). It first re-calculates the digest of the canonicalized Assertion (with the ds:Signature element removed, per the enveloped-signature transform) and compares it with DigestValue. It then canonicalizes SignedInfo and verifies SignatureValue over those bytes using the public key extracted from the certificate. Trust only the certificate you configured for the IdP, not whichever certificate arrives in KeyInfo.
Learning milestones:
- You can generate self-signed X.509 certificates and private keys. → You understand basic certificate management.
- Your IdP can construct a SAML Assertion with a correctly formed, but perhaps not fully valid, XML Digital Signature structure. → You understand XMLDSig elements.
- Your IdP can generate a valid digital signature for an Assertion using its private key. → You’ve implemented SAML signing.
- Your SP can verify the digital signature on an incoming SAMLResponse using the IdP’s public certificate. → You’ve implemented SAML signature verification.
Project 5: Implement the Assertion Consumer Service (ACS) on the SP
- File: LEARN_SAML_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Python, Node.js
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: HTTP / Security Validation
- Software or Tool: Basic C web server (socket programming), libxml2, OpenSSL
- Main Book: “SAML 2.0 Identity Federation” by Aaron Sachs, et al.
What you’ll build: Extend your SP program (from Project 2) into a fully functional web server. It will implement a specific HTTP POST endpoint (the Assertion Consumer Service, or ACS URL) that receives the IdP’s SAMLResponse. It will then:
- Extract and Base64-decode the SAMLResponse.
- Parse the SAMLResponse XML.
- Crucially, validate the IdP’s digital signature on the Assertion (from Project 4).
- Perform other essential SAML validations (Audience, NotBefore/NotOnOrAfter, Issuer).
- Extract the user’s NameID and establish a simulated local session for the user.
Why it teaches the fundamentals: This is the heart of the SP’s role. All the security checks and identity extraction happen here. You’ll understand how the SP confirms the legitimacy of the IdP’s assertion and uses it to grant access to its own resources.
Core challenges you’ll face:
- Setting up a robust C web server: Handling HTTP POST bodies, not just GET query parameters.
- Extracting POST data: Parsing application/x-www-form-urlencoded or multipart forms.
- Error Handling: What to do if any validation fails (e.g., redirect to an error page).
- Session Management: A basic (non-cryptographic) simulation of an SP session.
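Extracting POST data is mostly a matter of splitting the body on `&` and undoing the form encoding. The sketch below handles application/x-www-form-urlencoded bodies only; `form_get_param` is a hypothetical helper name, and it assumes the full body has already been read into a NUL-terminated string.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit; the caller has checked isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Find `name` in an application/x-www-form-urlencoded body and write its
 * decoded value to out. Returns 0 if found, -1 if the parameter is missing,
 * malformed, or does not fit in out_size bytes (including the NUL). */
int form_get_param(const char *body, const char *name,
                   char *out, size_t out_size)
{
    size_t name_len = strlen(name);
    const char *p = body;

    if (out_size == 0)
        return -1;
    while (*p != '\0') {
        const char *end = strchr(p, '&');
        if (end == NULL)
            end = p + strlen(p);
        /* Match "name=" only at the start of a pair, never mid-string. */
        if ((size_t)(end - p) > name_len &&
            strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            const char *v = p + name_len + 1;
            size_t o = 0;
            while (v < end) {
                char c = *v++;
                if (c == '+') {
                    c = ' ';                /* form encoding: '+' means space */
                } else if (c == '%') {
                    if (end - v < 2 || !isxdigit((unsigned char)v[0]) ||
                        !isxdigit((unsigned char)v[1]))
                        return -1;
                    c = (char)(hex_value(v[0]) * 16 + hex_value(v[1]));
                    v += 2;
                }
                if (o + 1 >= out_size)
                    return -1;
                out[o++] = c;
            }
            out[o] = '\0';
            return 0;
        }
        p = (*end == '&') ? end + 1 : end;
    }
    return -1;
}
```

Because browsers send a literal `+` in Base64 as `%2B`, decoding `+` to a space here does not corrupt the SAMLResponse value.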
Key Concepts:
- Assertion Consumer Service (ACS): The endpoint on the SP that receives the IdP’s SAML Response.
- SAML Validation Checks: Audience, IssueInstant, NotBefore/NotOnOrAfter conditions, SubjectConfirmation.
- SP Session Management: How the SP establishes a local authenticated state for the user.
Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Project 4.
Real world outcome: Your SP server, after receiving a valid SAML Response from the IdP (e.g., from Project 3), will display a personalized welcome page to the user.
# User's browser flow:
# 1. User navigates to https://sp.example.com/protected
# 2. SP redirects to IdP (AuthnRequest)
# 3. User authenticates at IdP
# 4. IdP returns an auto-submitting form; the browser POSTs the SAMLResponse to the SP's ACS
# SP server output in logs:
INFO: Received SAMLResponse at ACS.
INFO: Validating signature... Signature valid!
INFO: Validating AudienceRestriction... Audience OK.
INFO: Validating NotBefore/NotOnOrAfter... Conditions OK.
INFO: User 'testuser' authenticated. Establishing session.
# Browser displays:
"Welcome, testuser! You are now logged into the Service Provider."
Implementation Hints:
- Your SP server needs to listen for POST requests on the /acs path.
- Parse the HTTP POST request body to extract the SAMLResponse (and RelayState). You’ll need to handle Content-Length and read the full body.
- URL-decode the form value, then Base64-decode the SAMLResponse.
- Parse the XML using libxml2.
- Call your signature verification logic (from Project 4). This is the most critical check.
- Implement checks for AudienceRestriction (ensure the assertion is for this SP), IssueInstant (not too old), NotBefore/NotOnOrAfter (assertion validity period).
- If all checks pass, extract the NameID from the assertion.
- Send an HTTP 302 Redirect to a success page (/welcome) or a 200 OK HTML response directly, displaying the user’s NameID.
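The time-window and audience checks can be sketched without any libraries. The helper names below are hypothetical, the parser accepts only the `YYYY-MM-DDTHH:MM:SSZ` form (no fractional seconds or numeric offsets, which real IdPs may emit), and the clock-skew allowance is an assumption you would tune per deployment.

```c
#include <stdio.h>
#include <string.h>

/* Days since 1970-01-01 for a Gregorian calendar date. Plain integer
 * arithmetic, so there is no dependence on the non-standard timegm(). */
static long long days_from_civil(long long y, int m, int d)
{
    long long era, yoe, doy, doe;

    y -= (m <= 2);                          /* treat Jan/Feb as end of previous year */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;
    doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Parse "YYYY-MM-DDTHH:MM:SSZ" into seconds since the Unix epoch.
 * Returns 0 on success, -1 if the string is not in exactly that form. */
int parse_saml_instant(const char *s, long long *out)
{
    int y, mo, d, h, mi, sec;
    char z, extra;

    /* Exactly 7 conversions: an 8th means trailing characters after 'Z'. */
    if (sscanf(s, "%4d-%2d-%2dT%2d:%2d:%2d%c%c",
               &y, &mo, &d, &h, &mi, &sec, &z, &extra) != 7 || z != 'Z')
        return -1;
    if (mo < 1 || mo > 12 || d < 1 || d > 31 || h > 23 || mi > 59 || sec > 60)
        return -1;
    *out = days_from_civil(y, mo, d) * 86400LL + h * 3600LL + mi * 60LL + sec;
    return 0;
}

/* Returns 1 if now lies in [NotBefore - skew, NotOnOrAfter + skew),
 * 0 if it lies outside, and -1 if either timestamp cannot be parsed. */
int conditions_time_ok(const char *not_before, const char *not_on_or_after,
                       long long now, long long skew)
{
    long long nb, noa;

    if (parse_saml_instant(not_before, &nb) != 0 ||
        parse_saml_instant(not_on_or_after, &noa) != 0)
        return -1;
    return (now >= nb - skew && now < noa + skew) ? 1 : 0;
}

/* The Audience value must match this SP's entity ID exactly. */
int audience_ok(const char *audience, const char *sp_entity_id)
{
    return strcmp(audience, sp_entity_id) == 0;
}
```

In the ACS handler, `now` would come from `(long long)time(NULL)`, and any result other than 1 from `conditions_time_ok` should cause the assertion to be rejected.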
Learning milestones:
- Your SP server can receive and parse HTTP POST requests. → You’re handling more complex web requests.
- Your SP can correctly extract and Base64-decode the SAMLResponse from the POST body. → You’re processing incoming SAML messages.
- Your SP successfully validates the IdP’s digital signature and other SAML conditions. → You’ve implemented robust SAML security.
- Your SP can extract the user’s NameID from the assertion and simulate a logged-in state. → You’ve completed the core SP functionality.
Summary of Projects
| Project | Key Tools/Concepts | Difficulty |
|---|---|---|
| Project 1: Basic XML Structure Generator/Parser | libxml2, XML Namespaces | Beginner |
| Project 2: Implement HTTP Redirect Binding | zlib, Base64, URL Encoding | Intermediate |
| Project 3: Implement HTTP POST Binding | Basic C Web Server, HTML Forms | Intermediate |
| Project 4: Digital Signatures for SAML Assertions | OpenSSL, XMLDSig, X.509 | Advanced |
| Project 5: Implement the Assertion Consumer Service | HTTP POST, SAML Validation | Advanced |