Security Architecture and Engineering Reference Guide
CISSP Domain 3 · 10 topics · Concepts, real-world examples, and controls
Research, implement and manage engineering processes using secure design principles
Secure design principles are foundational architectural decisions that, when applied consistently, eliminate entire categories of vulnerability before code is written or systems are deployed. They are not optional refinements applied at the end — they are the structural logic of security architecture. Retrofitting them into a deployed system costs orders of magnitude more than building them in.
The principles covered in this section, each with a one-line summary:
Least privilege: minimum access required
Defense in depth: layered controls
Secure defaults: safe out of the box
Fail securely: failure = deny access
Segregation of duties: no single point of trust
Keep it simple: complexity = attack surface
Zero trust: never trust, always verify
Privacy by design: privacy built in, not bolted on
Shared responsibility: cloud security division
Threat modeling: identify threats by design
SASE: network + security edge
Least privilege
Every user, process, and system component should operate with only the minimum access rights required to perform its function — and for no longer than necessary. Applies to human users, service accounts, application processes, and system components equally. Least privilege limits the blast radius of any compromise: if an attacker gains control of an account, they can only do what that account can do.
Key application
Privilege creep — the gradual accumulation of permissions over time — is the enemy of least privilege. Access must be reviewed and revoked when roles change or access is no longer needed.
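An access review for privilege creep can be sketched as a set difference between what an account holds and what its current role needs. This is a minimal illustration only: the role names, permission strings, and the `ROLE_BASELINE` table are hypothetical, not taken from any particular IAM product.

```python
from dataclasses import dataclass

# Hypothetical role baselines: the permissions each role actually needs.
ROLE_BASELINE = {
    "analyst": {"reports:read"},
    "dba": {"db:read", "db:write", "db:backup"},
}


@dataclass
class Account:
    name: str
    role: str
    granted: set[str]


def excess_permissions(account: Account) -> set[str]:
    """Return permissions the account holds beyond its role's baseline."""
    # An unknown role justifies nothing, so every grant is flagged.
    baseline = ROLE_BASELINE.get(account.role, set())
    return account.granted - baseline


# A user who moved from DBA to analyst but kept the old grants.
alice = Account("alice", "analyst", {"reports:read", "db:write", "db:backup"})
print(sorted(excess_permissions(alice)))  # ['db:backup', 'db:write']
```

Anything the function returns is a candidate for revocation at the next role change or periodic review.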
Real-world example
The 2020 SolarWinds breach succeeded in part because the SolarWinds Orion update mechanism ran with SYSTEM-level privileges on every machine it was installed on. When SUNBURST malware was delivered via the update, it inherited those privileges — gaining immediate unrestricted access to every affected system. A least-privilege update service that could only install to specific directories and communicate to specific endpoints would have severely constrained what the malware could do after execution.
Defense in depth
Implementing multiple overlapping security controls so that no single control failure results in a breach. If one layer fails, the next layer catches what the first missed. Controls should be diverse — different technologies, different vendors, different principles — so that a vulnerability in one does not undermine all. Perimeter, network, endpoint, application, data, and monitoring layers all working together.
Key application
Defense in depth is not about redundant identical controls — it is about layered, diverse controls. A second firewall from the same vendor with the same configuration is not defense in depth. Network segmentation, endpoint detection, application allowlisting, and data encryption together are defense in depth.
Real-world example
The 2013 Target breach bypassed perimeter controls via an HVAC vendor’s credentials. Defense in depth would have stopped the attack at subsequent layers: network segmentation (HVAC systems on a separate VLAN unreachable from POS systems), application controls (POS systems not accepting connections from unauthenticated internal hosts), and data layer (cardholder data encrypted end-to-end, not decryptable at the POS terminal). Target had perimeter controls; it lacked the inner layers that defense in depth requires.
Secure defaults
Systems should be secure in their default configuration — requiring explicit action to reduce security, not to enable it. Accounts should be disabled by default, ports should be closed by default, services should be off by default, and permissions should be denied by default. The most common misconfiguration vulnerabilities exist precisely because vendors ship products with insecure defaults for ease of initial setup.
Key application
Default credentials are the classic secure defaults failure. Thousands of IoT devices ship with admin/admin. The Shodan search engine indexes them continuously. Every internet-connected device with unchanged default credentials is a publicly accessible attack surface.
Real-world example
The Mirai botnet (2016) infected over 600,000 IoT devices — cameras, routers, DVRs — almost entirely by scanning for devices using factory-default credentials. No sophisticated exploitation. No zero-days. Just default usernames and passwords that manufacturers had never required users to change, and users had never thought to change. The resulting DDoS attack took down major DNS infrastructure and made Twitter, Netflix, and Reddit unreachable across the US east coast. Secure defaults — requiring password change at first login — would have prevented the vast majority of infections.
Fail securely
When a system or control fails, it should fail in a state that denies access rather than grants it. A firewall that crashes should default to blocking all traffic, not passing it. A door lock with a power failure should remain locked (fail-closed), not unlock (fail-open) — unless it is a fire exit, where life safety overrides security. Every failure mode must be explicitly designed; undesigned failure modes tend to fail open.
Key application
Fail-open is common in availability-focused systems where outages are costly. Authentication bypasses often exploit error handling — when an authentication system throws an unexpected exception, poorly coded applications sometimes grant access rather than return an error. Input validation failures should always deny the request, never process it with partial validation.
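The error-handling point can be illustrated with a fail-closed wrapper around an authorization check. This is a sketch under stated assumptions: `is_authorized` and `flaky_backend` are hypothetical names, and a real system would also log the failure.

```python
def is_authorized(user: str, resource: str, check) -> bool:
    """Fail-closed wrapper: any error in the check results in denial."""
    try:
        # Only an explicit True grants access; truthy junk does not.
        return check(user, resource) is True
    except Exception:
        # An unexpected failure must never be treated as success.
        return False


def flaky_backend(user: str, resource: str) -> bool:
    # Hypothetical policy backend that is currently unreachable.
    raise ConnectionError("policy server unreachable")


print(is_authorized("alice", "/payroll", flaky_backend))  # False
```

The fail-open bug is usually the mirror image: an `except` branch that falls through to the "granted" path.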
Real-world example
In 2011, a certificate authority (DigiNotar) was compromised and fraudulent SSL certificates were issued for Google.com. Browsers had OCSP (Online Certificate Status Protocol) checks to detect revoked certificates — but many implementations were configured to fail-open: if the OCSP server was unreachable, the browser accepted the certificate anyway. Iranian users whose OCSP queries were blocked by government infrastructure were left with no protection. Fail-open OCSP checking rendered the revocation mechanism useless precisely when it was needed most.
Segregation of duties (SoD)
No single person or process should have sufficient access to carry out a sensitive operation from start to finish without oversight. Divides critical functions across multiple individuals so that fraud, error, or compromise of any one person cannot alone result in a significant loss. Applies to financial transactions, system administration, code deployment, and cryptographic key management.
Key application
SoD is enforced through role design — not just policy. If a system allows a single user account to both initiate and approve a transaction, SoD has not been implemented regardless of what the policy says. The technical control must enforce the separation.
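A technical control of this kind can be as small as one check in the approval path. The sketch below assumes a hypothetical `Transaction` class; the point is only that the initiator is rejected as approver by code rather than by policy.

```python
class SoDViolation(Exception):
    """Raised when one person tries to both initiate and approve."""


class Transaction:
    def __init__(self, tx_id: str, amount: float, initiator: str):
        self.tx_id = tx_id
        self.amount = amount
        self.initiator = initiator
        self.approver = None

    def approve(self, approver: str) -> None:
        # The separation is enforced here, not left to a policy document.
        if approver == self.initiator:
            raise SoDViolation(f"{approver} cannot approve their own transaction")
        self.approver = approver


tx = Transaction("T-1001", 25_000.0, initiator="alice")
try:
    tx.approve("alice")
except SoDViolation as err:
    print("blocked:", err)
tx.approve("bob")
print(tx.approver)  # bob
```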
Real-world example
The 2011 UBS rogue trader scandal saw Kweku Adoboli accumulate $2.3 billion in unauthorized trading losses. Investigation revealed he had been able to both execute trades and book the corresponding hedge positions — eliminating the independent verification that should have caught discrepancies. SoD in financial systems requires that the person executing a trade cannot also be the person responsible for confirming and reconciling it. When SoD is implemented only in policy but not enforced by system controls, it provides no protection against an insider who knows how to exploit the gap.
Keep it simple and small
Complexity is the enemy of security. Every unnecessary feature, option, API endpoint, service, or dependency is a potential attack surface. Simple systems are easier to understand, audit, and defend. Small components have smaller trusted computing bases (TCBs) and are easier to verify as secure. The principle applies to code, architecture, and policy.
Key application
Complexity accumulates invisibly over time. Systems start simple and grow complex through incremental feature additions, integrations, and technical debt. Regular architecture reviews to identify and eliminate unnecessary complexity are as important as adding new security controls.
Real-world example
The Heartbleed vulnerability (OpenSSL, 2014) exploited a small, unnecessary complexity in the TLS heartbeat extension, a feature that kept connections alive by sending small “I’m still here” messages. The implementation did not check the claimed payload length against the payload actually received, allowing attackers to read up to 64KB of server memory per request. The feature provided marginal benefit; its implementation introduced catastrophic risk. OpenSSL underpinned roughly two-thirds of web servers at the time, and an estimated 17% of trusted HTTPS sites were running a vulnerable version when the bug was disclosed. Omitting the heartbeat extension, or building OpenSSL without it, would have avoided the vulnerability entirely.
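The missing length check can be shown in a few lines. The message layout here is a simplification assumed for illustration (a 2-byte length followed by the payload), not the real TLS record format. Python's memory safety also means this code could not over-read memory the way the C implementation did, so the sketch only shows where the check belongs.

```python
import struct


def heartbeat_response(request: bytes) -> bytes:
    """Echo a heartbeat payload, rejecting a claimed length that is too large.

    Assumed layout: 2-byte big-endian payload length, then the payload.
    """
    if len(request) < 2:
        raise ValueError("truncated heartbeat")
    (claimed_len,) = struct.unpack(">H", request[:2])
    payload = request[2:]
    # The check the vulnerable code lacked: the claimed length must not
    # exceed the number of bytes actually received.
    if claimed_len > len(payload):
        raise ValueError("claimed length exceeds actual payload")
    return payload[:claimed_len]


print(heartbeat_response(b"\x00\x04ping"))  # b'ping'
```

A request such as `b"\xff\xffhi"` claims 65,535 bytes while carrying two, and is rejected instead of answered.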
Zero trust
No user, device, or network location should be implicitly trusted. Every access request — regardless of whether it originates inside or outside the corporate network — must be authenticated, authorized, and continuously validated before access is granted. Zero trust replaces the perimeter model (“inside the firewall = trusted”) with an identity-centric model (“verified identity + verified device + verified context = access”).
Key application
Zero trust is an architectural principle, not a product. It requires: strong identity verification (MFA), device health verification, least-privilege access, microsegmentation, and continuous monitoring of all traffic — even east-west traffic within the network.
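The access decision described above can be sketched as a function that takes identity, device state, and authorization as inputs and never consults network location. The `AccessRequest` fields and the `POLICY` table are hypothetical; real deployments evaluate far richer signals and re-evaluate them continuously.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user_authenticated: bool  # e.g. MFA completed
    device_compliant: bool    # e.g. patched, disk encrypted
    role: str
    resource: str
    source_network: str       # recorded, but never used to grant trust


# Hypothetical least-privilege policy: which roles may reach which resources.
POLICY = {"payroll-db": {"finance"}, "wiki": {"finance", "engineering"}}


def decide(req: AccessRequest) -> bool:
    """Grant only when identity, device, and authorization all check out."""
    allowed_roles = POLICY.get(req.resource, set())  # unknown resource: deny
    return (req.user_authenticated
            and req.device_compliant
            and req.role in allowed_roles)


inside = AccessRequest(True, False, "finance", "payroll-db", "corporate-lan")
remote = AccessRequest(True, True, "finance", "payroll-db", "coffee-shop-wifi")
print(decide(inside), decide(remote))  # False True
```

The request from the corporate LAN is denied because the device is non-compliant, while the remote request succeeds. In a perimeter model the outcomes would be reversed.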
Real-world example
Google’s BeyondCorp initiative, launched after the Operation Aurora attacks (2010), moved Google’s entire workforce off the VPN model. Instead of trusting users because they were on the corporate network, every access request was evaluated based on device health certificates, user identity, and request context — regardless of physical location. By 2017, most Google employees could work securely from any network without a VPN. This architecture proved its value in 2020 when the entire workforce shifted to remote work overnight — no VPN infrastructure, no bottlenecks.
Privacy by design
Privacy protections must be built into systems and processes from the start — not added as a compliance layer after the fact. The seven foundational principles (Ann Cavoukian, 1990s): proactive not reactive; privacy as the default; privacy embedded in design; full functionality (not security vs. privacy); end-to-end security; visibility and transparency; respect for user privacy. Privacy by design is now a regulatory requirement under GDPR Article 25.
Key application
Privacy by design requires data minimization at the schema level — if a field doesn’t need to exist, it shouldn’t be in the database. Pseudonymization and anonymization at the architecture level. Consent management built into user flows, not appended afterward.
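One common way to pseudonymize at the storage layer is a keyed hash of the direct identifier. The sketch below uses HMAC-SHA-256 from the Python standard library; the key value and field names are placeholders, and a real key would be held separately from the data set.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed, repeatable token."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for illustration only; store real keys in a secrets manager.
KEY = b"example-key-do-not-use"

record = {"email": "pat@example.com", "visit_count": 3}
stored = {
    "user_token": pseudonymize(record["email"], KEY),
    "visit_count": record["visit_count"],  # the email itself is never stored
}
print(sorted(stored))  # ['user_token', 'visit_count']
```

The same input always yields the same token, so records can still be joined, but reversing a token requires the key.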
Real-world example
Apple’s use of on-device processing for Face ID, and for much of its Siri and Health data handling, is a structural privacy by design decision: sensitive processing happens on the device rather than in Apple’s cloud, so data that stays on the device is not available to Apple even if disclosure is compelled. This architectural choice removes entire categories of privacy risk (cloud breach, government access, insider threat) for that data by ensuring it is never in Apple’s possession. Privacy was not added to the architecture; it defined the architecture.
Shared responsibility
In cloud environments, security responsibilities are divided between the cloud service provider (CSP) and the customer according to the service model. The CSP secures the infrastructure; the customer secures what they build and deploy on it. The boundary shifts depending on IaaS, PaaS, or SaaS. Misunderstanding this boundary — assuming the CSP secures everything — is one of the most common causes of cloud security incidents.
Key application
IaaS: CSP secures physical, network, and hypervisor. Customer secures OS, runtime, applications, data, and access. PaaS: CSP additionally secures OS and runtime. Customer secures applications and data. SaaS: CSP secures almost everything. Customer secures data, user access, and configuration.
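The split above can be written down as a lookup table, which is a convenient form for review checklists. The layer names are shorthand chosen for this sketch, and the SaaS row is simplified: the text says only that the CSP secures "almost everything".

```python
LAYERS = ["physical", "network", "hypervisor", "os", "runtime",
          "application", "data", "access"]

# Layers the provider secures under each service model.
CSP_SECURES = {
    "IaaS": {"physical", "network", "hypervisor"},
    "PaaS": {"physical", "network", "hypervisor", "os", "runtime"},
    "SaaS": {"physical", "network", "hypervisor", "os", "runtime", "application"},
}


def responsible_party(model: str, layer: str) -> str:
    """Return 'CSP' or 'customer' for a given service model and layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "CSP" if layer in CSP_SECURES[model] else "customer"


print(responsible_party("IaaS", "os"), responsible_party("PaaS", "os"))
# customer CSP
```

Data and access remain with the customer in every row, which is where most cloud incidents originate.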
Real-world example
The 2019 Capital One breach occurred because a misconfigured Web Application Firewall — deployed by Capital One on AWS infrastructure — allowed an SSRF (Server-Side Request Forgery) attack to access the EC2 Instance Metadata Service and retrieve IAM credentials. AWS’s infrastructure was not breached. The misconfiguration was Capital One’s responsibility under the shared responsibility model. AWS explicitly documents that WAF configuration is a customer responsibility. The breach affected 100M+ customers and cost Capital One $80M in regulatory fines and $190M in a class action settlement.
Threat modeling
A structured process for identifying potential threats to a system during design, analyzing how they could be realized, and determining the most effective controls. Done at the architecture phase, threat modeling prevents entire categories of vulnerability from being built in. Core methodologies include STRIDE (component-level analysis), PASTA (business-risk-aligned), and Attack Trees (visual attack path analysis).
Key application
Threat modeling should be triggered by any significant architectural decision: new data flows, new external integrations, new authentication mechanisms, changes to trust boundaries. The output is a set of prioritized findings — each a potential threat — paired with a recommended control and a risk acceptance decision for any finding not immediately addressed.
Real-world example
Microsoft’s Security Development Lifecycle (SDL) mandates threat modeling for every new product feature. When designing a new API endpoint, the team uses STRIDE to ask: can this endpoint be spoofed? Can the data it returns be used for information disclosure? Can it be used to elevate privileges? This process identified a privilege escalation path in Windows Hello’s PIN-reset flow during design — before the feature shipped. Post-launch discovery of the same issue would have required an emergency patch and public CVE disclosure. Design-time discovery cost an afternoon.
Secure Access Service Edge (SASE)
SASE (pronounced “sassy”) is an architectural framework that converges wide-area networking (SD-WAN) with security services (CASB, FWaaS, ZTNA, SWG) delivered as a unified cloud service. Rather than routing traffic through a central data center for security inspection, SASE enforces security at the edge — closest to the user and resource. Eliminates the backhauling of remote user traffic to central VPN concentrators.
Key application
Components: SD-WAN (intelligent routing), CASB (cloud security broker), FWaaS (firewall as a service), ZTNA (zero trust network access), SWG (secure web gateway). Each enforces policy at the network edge without requiring traffic to traverse a central hub.
Real-world example
A global retail company with 500 branch locations had been routing all internet-bound traffic through a central data center for security inspection — adding 80–200ms latency to every cloud application. After adopting SASE, each branch connects directly to cloud applications through a local PoP with full security inspection. Latency dropped by 60%, VPN infrastructure was decommissioned, and security policy is now enforced consistently across all branches without backhauling. The architecture also made the 2020 shift to remote work seamless — SASE treats remote users identically to branch users.
Understand the fundamental concepts of security models
Formal security models provide mathematically rigorous frameworks for defining and enforcing security properties. They translate high-level security policies into precise rules that can be implemented and verified. Each model is optimized for a specific security property — confidentiality or integrity — and understanding which model applies to which scenario is a core CISSP competency.
Bell-LaPadula
Confidentiality-focused. Government/military origin. Prevents unauthorized disclosure. Subjects cannot read up (No Read Up) or write down (No Write Down). Information flows upward only, toward higher classifications.
Simple rule: No read up (NRU) · * rule: No write down (NWD)
Biba
Integrity-focused. Inverse of Bell-LaPadula. Prevents unauthorized modification. Subjects cannot read down (no reading of lower-integrity data) or write up (no writing to higher-integrity objects).
Simple rule: No read down (NRD) · * rule: No write up (NWU)
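The four rules of the two lattice models above reduce to simple comparisons once levels are ordered. In this sketch the classification labels and numeric integrity levels are illustrative; only the direction of each comparison matters.

```python
LEVELS = {"Confidential": 1, "Secret": 2, "Top Secret": 3}


# Bell-LaPadula (confidentiality)
def blp_can_read(subject: str, obj: str) -> bool:
    return LEVELS[subject] >= LEVELS[obj]  # simple rule: no read up


def blp_can_write(subject: str, obj: str) -> bool:
    return LEVELS[subject] <= LEVELS[obj]  # * rule: no write down


# Biba (integrity): the comparisons are reversed
def biba_can_read(subject_integrity: int, object_integrity: int) -> bool:
    return subject_integrity <= object_integrity  # simple rule: no read down


def biba_can_write(subject_integrity: int, object_integrity: int) -> bool:
    return subject_integrity >= object_integrity  # * rule: no write up


print(blp_can_read("Secret", "Top Secret"))     # False
print(blp_can_write("Secret", "Confidential"))  # False
print(biba_can_write(1, 3))                     # False
```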
Clark-Wilson
Integrity-focused. Commercial integrity model. Uses Constrained Data Items (CDIs), Unconstrained Data Items (UDIs), Integrity Verification Procedures (IVPs), and Transformation Procedures (TPs). Enforces well-formed transactions and separation of duties.
Well-formed transactions · Separation of duties
Brewer-Nash (Chinese Wall)
Conflict of interest. Prevents conflicts of interest. Once a subject accesses data from one company, they cannot access data from a competitor in the same industry class. Used in financial services and consulting contexts.
Dynamic access control based on history
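History-based access control can be sketched by recording what each subject has already touched. The company names and conflict classes below are made up for illustration, and the sketch covers only the read-side rule.

```python
# Hypothetical data: company -> conflict-of-interest class.
CONFLICT_CLASS = {"BankA": "banking", "BankB": "banking", "OilCo": "energy"}


class ChineseWall:
    def __init__(self):
        self.history: dict[str, set[str]] = {}  # subject -> companies accessed

    def request_access(self, subject: str, company: str) -> bool:
        accessed = self.history.setdefault(subject, set())
        for prior in accessed:
            # Deny if the subject already touched a competitor in the same class.
            if prior != company and CONFLICT_CLASS[prior] == CONFLICT_CLASS[company]:
                return False
        accessed.add(company)
        return True


wall = ChineseWall()
print(wall.request_access("consultant", "BankA"))  # True
print(wall.request_access("consultant", "OilCo"))  # True
print(wall.request_access("consultant", "BankB"))  # False
```

The third request fails only because of the first one: the same subject with an empty history would have been allowed to access BankB.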
Graham-Denning
Access control. Defines eight protection rights for managing subjects, objects, and access rights. Focuses on how access rights are created, deleted, and transferred: the meta-model of access control.
8 primitive rights: create/delete subject, create/delete object, read/grant/delete/transfer access right
Take-Grant model
Access control. Represents access rights as a directed graph. Subjects can take rights from others or grant rights to others according to four rules: take, grant, create, remove. Used to analyze how privileges propagate through a system.
Four operations: take, grant, create, remove
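The take and grant rules can be sketched on a small rights graph. This is a partial illustration: create and remove are omitted, node names are hypothetical, and the single letters `t`, `g`, and `r` stand for the take, grant, and read rights.

```python
from collections import defaultdict


class TakeGrantGraph:
    def __init__(self):
        self.rights = defaultdict(set)  # (source, target) -> set of rights

    def add(self, src: str, dst: str, *rights: str) -> None:
        self.rights[(src, dst)].update(rights)

    def take(self, x: str, y: str, z: str, right: str) -> bool:
        """x takes `right` over z from y; needs x -t-> y and y -right-> z."""
        if "t" in self.rights[(x, y)] and right in self.rights[(y, z)]:
            self.rights[(x, z)].add(right)
            return True
        return False

    def grant(self, x: str, y: str, z: str, right: str) -> bool:
        """x grants y `right` over z; needs x -g-> y and x -right-> z."""
        if "g" in self.rights[(x, y)] and right in self.rights[(x, z)]:
            self.rights[(y, z)].add(right)
            return True
        return False


g = TakeGrantGraph()
g.add("alice", "bob", "t")
g.add("bob", "file", "r")
print(g.take("alice", "bob", "file", "r"))  # True
print("r" in g.rights[("alice", "file")])   # True
```

Applying the rules repeatedly until nothing changes shows every right a subject could eventually obtain, which is the privilege-propagation question the model is used to answer.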
🌐 Real-world example — Bell-LaPadula in practice
The US Department of Defense’s Trusted Computer System Evaluation Criteria (TCSEC / Orange Book) based its mandatory access control requirements for systems processing classified information on Bell-LaPadula. A Secret-cleared analyst can read Secret and Confidential documents but cannot read Top Secret documents (No Read Up) and cannot write a note into a Confidential document (No Write Down), because doing so would risk leaking higher-classification content into lower-classification material. Multi-Level Security (MLS) systems used in intelligence communities still build their core access control logic on Bell-LaPadula-style rules.
Select controls based upon systems security requirements
Control selection is not a one-size-fits-all exercise. Controls must be matched to the specific security requirements of the system — its classification level, threat environment, data sensitivity, availability requirements, and applicable regulatory obligations. A control appropriate for a public-facing web server is not appropriate for an air-gapped classified system, and vice versa.
Control selection framework
Identify requirements — what must the system protect (CIA priorities), under what threat model, for what regulatory regime? Requirements drive control selection; controls should never be selected before requirements are defined.
Select a baseline — use an appropriate framework (NIST SP 800-53, ISO 27002, PCI-DSS) to define the minimum control set for the system’s risk category. Baselines provide a defensible starting point.
Scope and tailor — eliminate controls that don’t apply to the system’s environment; adjust controls to fit operational context while maintaining required security posture. Document all deviations with justification.
Compensating controls — where a required control cannot be implemented, a compensating control provides equivalent or greater protection. Must be documented, time-bounded, and approved by the risk owner.
🌐 Real-world example
A hospital’s legacy MRI imaging system runs on Windows XP — long past end of support and unable to be patched or upgraded without recertification costing millions. The NIST SP 800-53 control requiring current patch status cannot be implemented. The hospital selects compensating controls: network isolation (the MRI system on a dedicated VLAN with no internet routing), application allowlisting (only the MRI software can execute), host-based monitoring by a separate security agent, and enhanced logging of all access. The risk is formally accepted by the CISO with a documented timeline for eventual replacement. This is proper compensating control documentation — not informal tolerance of unmanaged risk.
Understand security capabilities of Information Systems
Modern hardware and operating systems provide built-in security capabilities that, when correctly leveraged, provide stronger guarantees than software-only controls. Understanding these capabilities — and their limitations — is essential for designing systems that use the full available security stack.
Trusted Platform Module (TPM)
A dedicated microcontroller that stores cryptographic keys, certificates, and measurements of system state. Enables remote attestation (proving the system has not been tampered with), full-disk encryption key storage, and secure boot. Keys stored in TPM cannot be extracted by software — the TPM is a hardware security boundary.
Memory protection
Hardware and OS mechanisms preventing processes from accessing memory outside their allocated space. ASLR (Address Space Layout Randomization) randomizes memory addresses so that exploits cannot rely on known locations. DEP/NX (Data Execution Prevention / No-Execute) marks memory regions as non-executable, blocking execution of injected code. Stack canaries, inserted by the compiler, detect stack buffer overflows before a corrupted return address is used.
Secure boot
A UEFI standard that verifies the cryptographic signature of each component in the boot sequence before executing it. Prevents boot-level rootkits and malicious firmware from loading before the OS. The trust chain begins with the UEFI firmware (typically signed by the hardware manufacturer) and extends to the bootloader, OS kernel, and drivers.
Hardware Security Module (HSM)
A dedicated hardware device for cryptographic key generation, storage, and operations. Keys never leave the HSM in plaintext. Tamper-evident and tamper-resistant. Used for PKI root CAs, payment systems, code signing, and any application where key compromise would be catastrophic.
Encryption/decryption
Hardware-accelerated encryption (AES-NI instructions in Intel/AMD CPUs) enables full-disk encryption, TLS, and database encryption with minimal performance overhead. Self-encrypting drives (SEDs) perform encryption in the drive controller — transparent to the OS but protecting data even if the drive is physically removed.
Virtualization security
Hypervisors provide hardware-enforced isolation between virtual machines. VMs on the same physical host cannot access each other’s memory or storage without explicit configuration. Virtual TPMs extend hardware security guarantees to VMs. Confidential computing (Intel TDX, AMD SEV) encrypts VM memory even from the hypervisor.
🌐 Real-world example — TPM in enterprise use
Microsoft’s Windows 11 requires a TPM 2.0 chip as a hardware prerequisite. The requirement exists because BitLocker (full-disk encryption) uses the TPM to seal the encryption key to the measured state of the system. If a laptop is stolen and the attacker attempts to boot from an external device or modify the bootloader to extract the key, the TPM detects the state change and refuses to release the key — the disk remains encrypted and unreadable. Without the TPM, the key must be stored in software or entered manually at every boot — neither approach provides equivalent security against physical theft.
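The sealing behaviour can be modelled with a hash chain. In a TPM, a Platform Configuration Register (PCR) is updated by hashing its old value together with the new measurement, so the final value depends on every component in order. The sketch below is a software model only: `measure_boot` and `unseal` are hypothetical helpers, not TPM commands, and the component names are made up.

```python
import hashlib
import hmac


def extend(pcr: bytes, component: bytes) -> bytes:
    """PCR extend: new = SHA-256(old || SHA-256(component))."""
    return hashlib.sha256(pcr + hashlib.sha256(component).digest()).digest()


def measure_boot(components: list[bytes]) -> bytes:
    pcr = bytes(32)  # the register starts zeroed in this model
    for component in components:
        pcr = extend(pcr, component)
    return pcr


def unseal(sealed_to: bytes, current_pcr: bytes, key: bytes):
    """Release the key only if the current measurement matches the sealed one."""
    return key if hmac.compare_digest(sealed_to, current_pcr) else None


good = measure_boot([b"firmware", b"bootloader", b"kernel"])
tampered = measure_boot([b"firmware", b"evil-bootloader", b"kernel"])
print(unseal(good, good, b"disk-key"))      # b'disk-key'
print(unseal(good, tampered, b"disk-key"))  # None
```

Because each value feeds into the next hash, a modified bootloader changes the final register even though the kernel measured after it is unchanged.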
Assess and mitigate the vulnerabilities of security architectures, designs, and solution elements
Each system architecture type introduces specific vulnerability patterns. Security architects must understand the unique attack surface and applicable mitigations for each environment they design or assess.
Cloud-based systems (IaaS / PaaS / SaaS)
Key vulnerabilities: misconfiguration (publicly exposed S3 buckets, overpermissioned IAM roles), insecure APIs, data residency issues, shared tenancy side-channels, inadequate logging and monitoring, shadow IT cloud usage, and misunderstanding of the shared responsibility model.
Mitigations
Cloud Security Posture Management (CSPM) tools for continuous misconfiguration detection and benchmarking against the CIS Foundations Benchmarks; Cloud Access Security Broker (CASB) for shadow IT visibility; strict IAM policies with least privilege and MFA for all privileged accounts; CloudTrail or equivalent audit logging; encryption of all data at rest with customer-managed keys.
Real-world example
The 2019 Capital One breach: misconfigured WAF allowed SSRF attack to retrieve IAM credentials from the EC2 metadata service, granting access to 100M+ customer records in S3. Root cause: overpermissioned IAM role attached to EC2 instance + WAF misconfiguration. Both were customer configuration responsibilities under the shared responsibility model.
Industrial Control Systems (ICS) / SCADA
Key vulnerabilities: legacy protocols with no authentication (Modbus, DNP3), air-gap assumptions eroded by IT/OT convergence, limited patch cycles (systems cannot be taken offline), lack of encryption, default credentials, and remote access paths added for operational convenience without security review.
Mitigations
IT/OT network segmentation with unidirectional data diodes where possible; compensating controls for unpatchable systems (application allowlisting, network access controls); anomaly detection (Claroty, Dragos); strict change management; regular assessments against IEC 62443; incident response planning specific to OT environments.
Real-world example
The 2021 Oldsmar water treatment incident: an attacker remotely accessed the plant’s HMI via TeamViewer (a remote access tool that had been left installed and active) and raised the sodium hydroxide setpoint to dangerous concentrations. An operator noticed the cursor moving and manually reversed the change within minutes. The remote access path had weak authentication (reportedly a shared password), no access logging, and no alerting on chemical setpoint changes outside normal ranges. The attack required no exploitation, just valid credentials to an internet-reachable remote desktop tool.
Internet of Things (IoT)
Key vulnerabilities: default credentials never changed, no update mechanism or infrequent patches, minimal compute for security controls, unencrypted communications, wide attack surface (millions of devices), physical accessibility, and weak or absent authentication between devices.
Mitigations
Network segmentation (IoT devices on isolated VLANs with no lateral movement path to corporate systems); mandatory credential change at enrollment; device identity management; encrypted communications (TLS/DTLS); regular firmware update processes; asset inventory of all connected devices; monitoring for anomalous behavior (unexpected outbound connections).
Real-world example
Mirai botnet (2016): 600,000+ IoT devices infected via default credentials, used to launch a DDoS attack (reported at up to 1.2Tbps) against Dyn DNS, taking down Twitter, Netflix, Reddit, and GitHub. No exploitation required. Infected devices scanned the internet for open Telnet ports and tried a short hardcoded list of about 60 factory-default username/password pairs. Bans on universal default passwords in later regulation (such as the UK PSTI Act 2022) were motivated in large part by incidents like this one.
Containerization
Key vulnerabilities: container escape to host OS (containers share the host kernel — a kernel vulnerability affects all containers), insecure base images with known vulnerabilities, overprivileged containers (running as root), insecure secrets management (hardcoded credentials in images or environment variables), and insecure container registries.
Mitigations
Run containers as non-root users; use minimal base images (distroless/scratch); scan images for vulnerabilities at build and registry time (Trivy, Snyk); use secrets management systems (Vault, Kubernetes Secrets); enable security profiles (AppArmor, seccomp); network policies to control inter-container traffic; runtime security monitoring (Falco).
Real-world example
In 2019, a container escape vulnerability in runc (CVE-2019-5736) — the container runtime used by Docker and Kubernetes — allowed a malicious container to overwrite the host runc binary and gain root access to the host system. Any organization running Docker or Kubernetes was vulnerable. The fix required updating runc — but organizations running unpinned images or without container vulnerability scanning had no visibility into whether they were exposed.
Microservices and APIs
Key vulnerabilities: broken object-level authorization (BOLA — API returns objects the caller shouldn’t see), broken authentication, excessive data exposure (APIs returning more fields than needed), lack of rate limiting enabling enumeration and brute force, injection attacks through API parameters, insecure direct object references, and insecure inter-service communication.
Mitigations
OAuth 2.0 / OpenID Connect for API authentication and authorization; enforce object-level authorization checks server-side on every request; API gateway for rate limiting, authentication enforcement, and input validation; mTLS (mutual TLS) for service-to-service communication; API inventory and discovery; OWASP API Security Top 10 as a baseline testing checklist.
Real-world example
The 2018 Facebook “View As” breach: a chain of bugs caused the feature to generate an access token for the user being viewed rather than for the viewer, so attackers could collect tokens for other people’s accounts. The caller was authenticated, but the platform did not enforce whose data and tokens that caller was entitled to obtain, which is the same failure that underlies BOLA. About 29 million accounts had data accessed. Authorization checks must be applied at the object level on every API call; authentication alone is insufficient.
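An object-level authorization check is a comparison between the authenticated caller and the owner of the requested object, made on every request. The in-memory `POSTS` store and the `get_post` handler below are hypothetical stand-ins for a real API.

```python
POSTS = {
    101: {"owner": "alice", "body": "private note"},
    102: {"owner": "bob", "body": "draft"},
}


class Forbidden(Exception):
    pass


def get_post(authenticated_user: str, post_id: int) -> dict:
    post = POSTS.get(post_id)
    # One error for both "missing" and "not yours" avoids revealing which IDs exist.
    if post is None or post["owner"] != authenticated_user:
        raise Forbidden("not found or not permitted")
    return post


print(get_post("alice", 101)["body"])  # private note
try:
    get_post("alice", 102)  # authenticated, but not the owner
except Forbidden as err:
    print("denied:", err)
```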
Database systems
Key vulnerabilities: SQL injection (unparameterized queries), excessive database user privileges, unencrypted sensitive columns, audit logging disabled, direct internet exposure of database ports, default or weak database credentials, and unsecured database backups.
Mitigations
Parameterized queries / prepared statements (eliminates SQL injection); least-privilege database accounts (application accounts have SELECT/INSERT only on required tables — no DDL); column-level encryption for sensitive fields; Database Activity Monitoring (DAM); no direct internet access to database ports; encrypted backups stored separately from production; regular database vulnerability assessments.
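The difference between a spliced query and a parameterized one is easy to demonstrate with Python's built-in `sqlite3` module. The table and function names are made up for this sketch; the `?` placeholder syntax is specific to `sqlite3` and differs between drivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "user")])


def find_user_unsafe(name: str):
    # Vulnerable: the input is spliced into the SQL text.
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str):
    # Parameterized: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
print(len(find_user_unsafe(payload)))  # 2: the injected OR clause matches every row
print(len(find_user_safe(payload)))    # 0: no user has that literal name
```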
Real-world example
The 2008 Heartland Payment Systems breach — affecting 130 million payment card records — was initiated via SQL injection against a web application. The application’s database account had excessive privileges allowing the attacker to install malware on the database server that intercepted card data in transit. Parameterized queries would have prevented the initial SQL injection; least-privilege database accounts would have prevented the subsequent malware installation.
Embedded systems
Key vulnerabilities: hardcoded credentials and backdoors in firmware, no update mechanism, unencrypted firmware (reversible via binwalk/Ghidra), debug interfaces left enabled in production (JTAG, UART), outdated third-party libraries, limited or no runtime security monitoring, and physical access attacks.
Mitigations
Secure boot to verify firmware integrity; encrypted firmware images; disable debug interfaces before production deployment; firmware signing and update authentication; minimal attack surface (disable unused services and interfaces); secure coding practices for C/C++ (bounds checking, safe functions); hardware security for physical attack resistance; regular firmware security assessments.
Real-world example
In 2021, a security researcher disclosed CVE-2021-36260 in Hikvision IP cameras: a command injection vulnerability in the web server allowing unauthenticated remote code execution as root. In 2022, over 80,000 cameras were reported to be still publicly exploitable, months after the patch was released, because embedded device owners rarely apply firmware updates. The vulnerability was discovered via firmware analysis: the binary was not encrypted, allowing researchers to identify the flaw through static analysis.
Virtualized systems
Key vulnerabilities: VM escape (hypervisor vulnerability allowing a VM to access the host or other VMs), VM sprawl (unmanaged VMs accumulating outside change management), snapshot security (snapshots contain full disk state including encryption keys and credentials), insecure VM migration (live migration traffic is unencrypted by default in some hypervisors), and hypervisor compromise.
Mitigations
Keep hypervisor software patched promptly; encrypt VM migration traffic; restrict hypervisor management access (dedicated management network, MFA, least privilege); VM lifecycle management to prevent sprawl; encrypt VM snapshots at rest; monitor for unusual inter-VM traffic; use confidential computing (Intel TDX, AMD SEV) for sensitive workloads requiring isolation even from the hypervisor.
Real-world example
L1 Terminal Fault (Foreshadow): a speculative execution vulnerability in Intel processors that let an attacker read data resident in the L1 data cache across isolation boundaries. The original variant (CVE-2018-3615) targeted SGX enclaves; the virtualization variant (CVE-2018-3646) allowed a malicious VM to read memory belonging to the hypervisor or to other VMs sharing the same physical core. Cloud providers had to apply hypervisor patches and, in some cases, disable hyper-threading, with a reported performance impact of 10–30% for some workloads. No hypervisor software bug was required: co-residency on shared hardware was sufficient. Confidential computing architectures (encrypted memory per VM) reduce exposure to the hypervisor and to co-tenants, although microarchitectural flaws of this kind still require CPU-level fixes.
Select and determine cryptographic solutions
Cryptography is the mathematical foundation of most security controls — confidentiality, integrity, authentication, and non-repudiation all rely on it. Selecting the right cryptographic solution requires understanding the strengths, limitations, and appropriate use cases of each method, as well as the lifecycle management obligations of keys and algorithms.
Symmetric encryption
Single shared key for encryption and decryption. Fast — suitable for bulk data encryption. Key distribution is the hard problem: how do two parties securely share the key without a secure channel? Secure key exchange typically uses asymmetric encryption to bootstrap symmetric sessions.
AES-256 · ChaCha20 · 3DES (legacy)
Use for: encrypting data at rest, bulk data transfer after key exchange (TLS record layer), full-disk encryption.
Asymmetric encryption
Key pair: public key (freely distributed) encrypts or verifies; private key (secret) decrypts or signs. Solves the key distribution problem — no secure channel needed to share the public key. Significantly slower than symmetric — not suitable for bulk data encryption. Used to exchange symmetric keys and for digital signatures.
RSA-2048+ · Diffie-Hellman · DSA
Use for: TLS handshake (key exchange), digital signatures, certificate issuance, email encryption (PGP/S-MIME).
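How a public exchange can bootstrap a shared symmetric key is easiest to see with a toy Diffie-Hellman run. This is a minimal sketch, not a usable implementation: the modulus `P` and generator `G` are illustrative assumptions (real deployments use vetted groups or elliptic curves), and the exchange is unauthenticated.

```python
import hashlib
import secrets

# Toy public parameters, for illustration only. 2**127 - 1 is prime,
# but it is far too small and unvetted for real use.
P = 2**127 - 1
G = 3

def dh_keypair():
    # Private exponent chosen uniformly from [2, P - 2].
    private = secrets.randbelow(P - 3) + 2
    public = pow(G, private, P)
    return private, public

def dh_shared_key(their_public: int, my_private: int) -> bytes:
    # Both sides compute G^(a*b) mod P, then hash it into a 256-bit key.
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

alice_priv, alice_pub = dh_keypair()
bob_priv, bob_pub = dh_keypair()

# Only the public values cross the network.
alice_key = dh_shared_key(bob_pub, alice_priv)
bob_key = dh_shared_key(alice_pub, bob_priv)
```

Both parties end up with the same 32-byte key without ever transmitting it. Because nothing here proves who sent each public value, an active attacker could substitute their own, which is why real protocols authenticate the exchange with certificates or signatures.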
Elliptic curve cryptography (ECC)
Asymmetric cryptography based on elliptic curve mathematics. Achieves equivalent security to RSA with much shorter key lengths: a 256-bit ECC key provides roughly equivalent security to a 3072-bit RSA key. Preferred for constrained environments (mobile, IoT) and modern TLS implementations because of its performance; ephemeral ECDH (ECDHE) key exchange additionally provides forward secrecy.
ECDSA · ECDH · Curve25519
Use for: TLS 1.3 key exchange, code signing, mobile authentication, certificate keys where size matters.
Quantum and post-quantum
A sufficiently large quantum computer could break RSA, Diffie-Hellman, and ECC using Shor’s algorithm, rendering all widely deployed asymmetric cryptography insecure. NIST finalized its first post-quantum cryptographic standards in 2024: ML-KEM (FIPS 203, derived from CRYSTALS-Kyber) for key establishment and ML-DSA (FIPS 204, derived from CRYSTALS-Dilithium) for signatures. Symmetric encryption is affected far less: Grover’s algorithm roughly halves the effective key strength, so AES-256 retains about 128 bits of security. Quantum Key Distribution (QKD) uses quantum mechanics to distribute keys with information-theoretic security: in principle, any interception disturbs the quantum states and can be detected.
CRYSTALS-Kyber · CRYSTALS-Dilithium
Harvest now, decrypt later: adversaries recording encrypted traffic today can decrypt it when quantum computers mature. Organizations with long-lived secrets (government, healthcare, finance) should plan migration to post-quantum algorithms now.
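One common way to reason about the urgency is a rule of thumb usually attributed to Michele Mosca: if the time the data must stay secret plus the time needed to migrate exceeds the time until a cryptographically relevant quantum computer exists, the data is already at risk. The sketch below encodes that inequality; the function name and all three inputs are illustrative estimates, not predictions.

```python
def migration_is_urgent(shelf_life_years: float,
                        migration_years: float,
                        years_to_quantum: float) -> bool:
    """Mosca's inequality: at risk when shelf life + migration time > time to quantum."""
    return shelf_life_years + migration_years > years_to_quantum

# Hypothetical inputs: health records that must stay confidential for 25 years,
# a 5-year migration programme, and an assumed 15-year quantum horizon.
health_records_at_risk = migration_is_urgent(25, 5, 15)

# Hypothetical inputs: session data that is worthless after 2 years.
session_data_at_risk = migration_is_urgent(2, 3, 15)
```

The quantum horizon is the least certain input, so it is worth evaluating the inequality across a range of estimates rather than a single value.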
Public Key Infrastructure (PKI)
PKI is the framework of policies, procedures, hardware, software, and people needed to create, manage, distribute, use, store, and revoke digital certificates. It provides the trust infrastructure that makes asymmetric cryptography practical at scale.
Certificate Authority (CA)
Issues and signs digital certificates, binding a public key to an identity. Root CAs are implicitly trusted; intermediate CAs are signed by the root. The root CA’s private key must be protected as the foundation of the entire trust chain — typically in an offline HSM.
Certificate Revocation
CRL (Certificate Revocation List): a published list of revoked certificates. OCSP (Online Certificate Status Protocol): real-time revocation checking. OCSP Stapling: server includes a CA-signed OCSP response in the TLS handshake, eliminating the client’s need to contact the CA.
Certificate transparency
Public, append-only logs of issued TLS certificates. Allows domain owners to monitor for unauthorized certificate issuance. Required by Chrome for newly issued publicly trusted certificates since 2018. Created in response to CA failures such as the DigiNotar compromise (2011), a pivotal case in PKI accountability.
Key lifecycle management
Keys must be generated (strong RNG), distributed (secure channel), stored (HSM or encrypted key store), rotated (on schedule or after compromise), and destroyed (cryptographic erasure). An expired or compromised key with no rotation plan is a ticking vulnerability.
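The generation and rotation steps can be sketched with the standard library. This is an illustration under stated assumptions: the `KeyRecord` structure, the function names, and the 90-day rotation period are hypothetical, and production key material would live in an HSM or managed key store rather than in a Python object.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class KeyRecord:
    key_id: str
    material: bytes          # in practice held inside an HSM or key store
    created: datetime
    compromised: bool = False

def generate_key(key_id: str, now: datetime = None) -> KeyRecord:
    # secrets draws from the operating system's cryptographically strong RNG.
    now = now or datetime.now(timezone.utc)
    return KeyRecord(key_id, secrets.token_bytes(32), now)

def needs_rotation(record: KeyRecord, max_age: timedelta,
                   now: datetime = None) -> bool:
    # Rotate on schedule, or immediately after a suspected compromise.
    now = now or datetime.now(timezone.utc)
    return record.compromised or (now - record.created) >= max_age

created = datetime(2024, 1, 1, tzinfo=timezone.utc)
record = generate_key("demo-key-1", now=created)
policy = timedelta(days=90)   # assumed rotation period

due_after_60_days = needs_rotation(record, policy, now=created + timedelta(days=60))
due_after_105_days = needs_rotation(record, policy, now=created + timedelta(days=105))
```

Running a check like this on a schedule turns rotation from a manual task into something that fails visibly when a key is overdue.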
🌐 Real-world example — PKI compromise
In 2011, DigiNotar (a Dutch CA) was compromised by an attacker believed to be operating from Iran, who issued over 500 fraudulent certificates, including certificates for google.com, cia.gov, and mossad.gov.il. These certificates were used in a man-in-the-middle attack against approximately 300,000 Iranian Gmail users, intercepting their communications. DigiNotar’s root CA certificates were removed from all major browsers within days, and the company was bankrupt within weeks. The Dutch government, which relied on DigiNotar for part of its PKIoverheid infrastructure, had to migrate urgently to other certificate providers. A CA compromise does not just affect one certificate: it undermines the trust of every certificate that CA has ever issued.
Understand methods of cryptanalytic attacks
Understanding how cryptographic systems are attacked is essential for selecting appropriate algorithms, implementations, and deployment configurations. Many cryptographic vulnerabilities are not mathematical — they are implementation failures, side-channel leakages, or protocol design weaknesses.
| Attack | How it works | Real-world example / implication |
|---|---|---|
| Brute force | Exhaustive trial of all possible keys or passwords. Work grows exponentially with key length: a 128-bit key requires up to 2¹²⁸ trials. | Feasible against weak passwords and short keys. A 40-bit export-grade key (long obsolete) can be brute-forced in seconds, and 56-bit DES fell to dedicated hardware in 1998. Brute-forcing AES-256 is computationally infeasible even with all current computing power. |
| Ciphertext only | Attacker has only ciphertext — no known plaintext. Works by identifying patterns (repeating blocks in ECB mode), statistical analysis, or weak key generation. | ECB mode encryption of images produces visible patterns because identical plaintext blocks produce identical ciphertext blocks. The “ECB penguin” is the canonical demonstration: an encrypted bitmap of a penguin still looks like a penguin in ECB mode. |
| Known plaintext | Attacker has both the plaintext and its corresponding ciphertext, and uses this to derive key information or decrypt other ciphertext. | WWII Enigma: Allied cryptanalysts exploited predictable message components (weather reports always started with “Wetter” — the German word for weather) as known plaintext to crack daily Enigma keys. Message standardization is a known-plaintext vulnerability. |
| Frequency analysis | Statistical analysis of character or symbol frequencies in ciphertext to deduce the key or plaintext. Effective against substitution ciphers where single characters are substituted. | English plaintext has predictable letter frequencies (e, t, a, o are most common). In a simple Caesar cipher, the most frequent character in a sufficiently long ciphertext most likely maps to ‘e’ in plaintext. Frequency analysis is why modern ciphers use diffusion, which spreads the influence of each plaintext bit across the ciphertext. |
| Chosen ciphertext | Attacker can choose specific ciphertexts, have them decrypted, and use the results to learn about the key or the plaintext. Exploits weaknesses in padding schemes and decryption oracles. | PKCS#1 v1.5 padding oracle attacks (Bleichenbacher 1998) allowed an attacker to decrypt RSA ciphertexts, such as a TLS premaster secret, without ever learning the private key. The SSL 3.0 POODLE attack (2014) exploited CBC padding oracles to decrypt session cookies. Padding oracle attacks are why authenticated encryption (AES-GCM) is now preferred over MAC-then-encrypt CBC constructions. |
| Side-channel | Exploits information leaked by the physical implementation (timing variations, power consumption, electromagnetic emissions, acoustic signals) rather than mathematical weaknesses in the algorithm. | Timing attacks on RSA: in a naive implementation, the time taken to compute RSA operations varies with the key bit values. An attacker measuring thousands of decryption times can reconstruct the private key. Cryptographic implementations should use constant-time code for secret-dependent operations, including constant-time comparison functions. |
| Fault injection | Deliberately induces hardware faults (voltage glitching, clock manipulation, laser fault injection) to cause cryptographic implementations to produce erroneous output that reveals key material. | Smart card attacks: by inducing a fault during RSA-CRT signing, an attacker can obtain one faulty signature that is correct modulo only one of the two primes, from which the modulus can be factored and the private key reconstructed. Tamper-resistant hardware (HSMs, smart cards) includes fault detection circuits that wipe keys on detected fault conditions. |
| Man-in-the-Middle (MITM) | Attacker positions themselves between two communicating parties, intercepting and potentially modifying messages while each party believes they are communicating directly with the other. | SSL stripping attacks (Moxie Marlinspike, 2009): MITM intercepts HTTPS requests and delivers HTTP to the victim while maintaining HTTPS to the server. HSTS (HTTP Strict Transport Security) and HSTS preloading prevent SSL stripping by instructing browsers to always use HTTPS for specific domains. |
| Pass the hash | In Windows NTLM authentication, password hashes can be used directly to authenticate without knowing the plaintext password. An attacker extracts a hash from memory (for example with Mimikatz) and reuses it on other systems. | Pass-the-hash is a primary lateral movement technique in Windows environments. Mitigations: Protected Users security group (blocks NTLM authentication and credential caching for its members), Credential Guard (virtualization-based protection of LSASS secrets), Restricted Admin mode for RDP. Kerberos does not accept a replayed NTLM hash in the same way, but it has its own credential-theft variants (pass-the-ticket, overpass-the-hash). |
| Kerberos exploitation | Attacks on the Kerberos authentication protocol: Kerberoasting (requesting service tickets for offline cracking), Pass-the-Ticket (using stolen TGTs/service tickets), Golden Ticket (forging TGTs using the KRBTGT hash), Silver Ticket (forging service tickets). | Golden Ticket: attacker compromises the domain controller and extracts the KRBTGT account hash. With this, they can forge Kerberos TGTs valid for any user, to any service, with a lifetime of their choosing (10 years by default in Mimikatz). This persistence mechanism survives user password resets. Mitigation: reset the KRBTGT password twice after any suspected DC compromise; monitor for TGT lifetimes exceeding domain policy. |
| Ransomware | Malware that encrypts victim data using strong symmetric encryption and demands payment for the decryption key. Not a cryptanalytic attack on the encryption itself — exploits deployment and key management. The attacker holds the only copy of the decryption key. | Colonial Pipeline (2021): DarkSide ransomware encrypted the IT network, forcing pipeline shutdown. The company paid $4.4M in Bitcoin (approximately $2.3M subsequently recovered by FBI). Mitigations: offline and immutable backups (attackers increasingly target backup systems first), network segmentation, privileged account protection, EDR, and incident response planning. |
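Two rows of the table lend themselves to a short illustration: frequency analysis against a Caesar cipher, and the constant-time comparison recommended against timing side channels. The sketch below uses only the Python standard library; the function names and sample message are hypothetical, and the shift guess assumes English text long enough for ‘e’ to dominate.

```python
import hmac
import string
from collections import Counter

def caesar_encrypt(text: str, shift: int) -> str:
    # Shift each letter by a fixed amount; leave other characters unchanged.
    out = []
    for ch in text.lower():
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

def guess_caesar_shift(ciphertext: str) -> int:
    # Frequency analysis: assume the most common ciphertext letter is 'e'.
    letters = [c for c in ciphertext.lower() if c in string.ascii_lowercase]
    if not letters:
        raise ValueError("ciphertext contains no letters")
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26

def tags_match(expected: bytes, supplied: bytes) -> bool:
    # Constant-time comparison: running time does not depend on
    # how many leading bytes of the two values agree.
    return hmac.compare_digest(expected, supplied)

message = "meet me here then see the secret entrance between the trees"
ciphertext = caesar_encrypt(message, 3)
recovered_shift = guess_caesar_shift(ciphertext)
recovered_text = caesar_encrypt(ciphertext, -recovered_shift)
```

A plain `==` on secret values can return as soon as the first differing byte is found, which is the timing difference an attacker measures; `hmac.compare_digest` is the standard-library function intended for this purpose.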
Apply security principles to site and facility design
Physical security is the outermost layer of defense in depth. Technical controls are irrelevant if an attacker can walk up to a server and attach a USB drive, or if a natural disaster destroys the building. Site selection, facility design, and physical access controls must be treated with the same rigor as network and application security.
Site selection principles
Crime Prevention Through Environmental Design (CPTED)
Design the physical environment to reduce opportunities for crime and unauthorized access. Natural surveillance (clear sightlines), natural access control (directing movement through designed paths), territorial reinforcement (clear boundaries between public and private space).
Visibility and natural barriers
Avoid locating critical facilities in areas with high crime rates, flood plains, flight paths, or proximity to high-risk industrial sites. Natural geographic barriers (hills, rivers) and designed barriers (berms, bollards) provide protection against vehicle-borne threats and flooding.
Utility resilience
The site must have access to multiple utility feeds (power, water, telecom) from different directions. Single utility entry points are single points of failure (SPOFs). Telecom entry should be diverse: multiple physical paths from different providers, entering the building at different points.
Layered perimeters
Defense in depth applies to physical security: outer perimeter (fencing, bollards), building perimeter (walls, secure entry points), interior zones (access-controlled areas by sensitivity), inner sanctum (server room, data center, vault). Each layer requires an additional authentication event.
🌐 Real-world example — physical breach
In 2019, a Chinese national gained entry to President Trump’s Mar-a-Lago resort by telling Secret Service agents she was there to use the pool; staff admitted her on the assumption that she was related to a member with the same surname. She passed through multiple physical security checkpoints before a receptionist challenged her. The incident demonstrated that physical security is only as strong as its most permissive entry point. Verifying every visitor against an authoritative guest list, challenging tailgaters, mantrap vestibules, and anti-tailgating barriers are the physical security equivalents of multi-factor authentication: each additional layer reduces the probability that a simple social engineering or physical bypass succeeds.
Design site and facility security controls
Facility security controls protect the physical infrastructure within which all other security controls operate. A data center with perfect network security but inadequate fire suppression or power conditioning is one equipment failure away from catastrophic data loss.
Wiring closets / IDFs
Intermediate Distribution Facilities contain network equipment that bridges floors or zones. Physical access enables network tapping, cable disconnection, or equipment installation.
Controls: locked enclosures, access logging, no public access, cable management to detect tampering, regular inspections.
Server rooms / data centers
Highest-sensitivity physical space. Must protect against unauthorized access, environmental threats (heat, moisture, fire), power disruption, and physical theft of media.
Controls: biometric + badge access, mantrap, 24/7 CCTV, raised floors, precision cooling, hot/cold aisle separation, two-person integrity rule for sensitive operations.
Media storage
Backup tapes, optical media, and portable drives containing production data must be stored with controls equivalent to the data’s classification level.
Controls: locked fireproof safes or vaults, off-site rotation, encrypted media, media inventory and chain of custody, destruction certificates.
Evidence storage
Forensic evidence must maintain chain of custody from collection through disposition. Unauthorized access breaks chain of custody and may render evidence inadmissible.
Controls: locked, access-logged storage with limited key holders, evidence log for every access, tamper-evident packaging, environmental controls to preserve media.
HVAC
Heating, ventilation, and air conditioning controls temperature and humidity — both critical for equipment reliability. Failure of HVAC in a data center can cause equipment failure within minutes in high-density environments.
Controls: redundant HVAC units (N+1 minimum), automated temperature/humidity monitoring and alerting, hot/cold aisle containment, backup cooling plans. Target: 18–27°C, 40–60% RH.
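Automated monitoring against the target ranges above can be sketched as a simple threshold check. The function name and alert wording are hypothetical; a real deployment would feed this from sensor telemetry and route the alerts to an on-call system.

```python
# Target ranges quoted in the text above; tune them to the equipment's specifications.
TEMP_RANGE_C = (18.0, 27.0)
RH_RANGE_PCT = (40.0, 60.0)

def check_reading(temp_c: float, rh_pct: float) -> list[str]:
    """Return a list of alert messages; an empty list means the reading is in range."""
    alerts = []
    low, high = TEMP_RANGE_C
    if not low <= temp_c <= high:
        alerts.append(f"temperature {temp_c:.1f} C outside {low:.0f}-{high:.0f} C")
    low, high = RH_RANGE_PCT
    if not low <= rh_pct <= high:
        alerts.append(f"humidity {rh_pct:.0f}% RH outside {low:.0f}-{high:.0f}% RH")
    return alerts

normal = check_reading(22.0, 50.0)
failing_hvac = check_reading(30.5, 35.0)
```

In high-density rooms the useful signal is often the rate of change rather than the absolute value, so production monitoring typically alerts on trends as well as thresholds.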
Fire suppression
Water-based suppression damages electronics, and CO₂ poses an asphyxiation risk in occupied spaces, so data centers rely primarily on clean agent systems that extinguish fire without damaging equipment or endangering staff. Where sprinklers are required, pre-action designs limit accidental discharge.
Controls: FM-200 or Novec 1230 clean agent systems, early warning aspirating smoke detection (VESDA: Very Early Smoke Detection Apparatus), pre-action sprinklers (require two triggers), staff evacuation procedures before discharge.
Power
Power disruption is one of the most common causes of data center outages. Even brief interruptions cause equipment failure and data corruption without power conditioning.
Controls: UPS (Uninterruptible Power Supply) for bridge power during generator startup; diesel generators for extended outages; dual utility feeds from separate substations; PDU-level redundancy (2N power distribution); regular load testing of UPS and generators.
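The redundancy labels used here (N+1, 2N) reduce to one question: can the remaining units carry the load after a given number of failures? A minimal sketch, with hypothetical capacities and loads:

```python
def survives_failures(unit_capacity_kw: float, unit_count: int,
                      load_kw: float, failures: int = 1) -> bool:
    """True if the units left after `failures` outages can still carry the load."""
    remaining = max(unit_count - failures, 0)
    return remaining * unit_capacity_kw >= load_kw

# Hypothetical 200 kW load served by 100 kW units.
n_only = survives_failures(100, 2, 200)               # N: no spare capacity
n_plus_1 = survives_failures(100, 3, 200)             # N+1: one spare unit
two_n = survives_failures(100, 4, 200, failures=2)    # 2N: a full second set
```

The same arithmetic applies to cooling units, UPS modules, and generators, but it assumes failures are independent. A shared upstream dependency, such as a single utility feed or a single maintenance procedure, can take out several units at once.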
Environmental threats
Natural and man-made threats to facility operations must be assessed and designed against. Site selection should avoid high-risk areas where possible.
Flood: raised floors, above-flood-level siting, water detection sensors. Earthquake: seismic-rated racks and UPS, flexible cable management. Natural disaster: geographic diversity in DR sites (100+ miles minimum).
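The geographic diversity guideline can be checked with a great-circle distance calculation. The sketch below uses the haversine formula; the coordinates are approximate city-centre values standing in for hypothetical site locations, and the 100-mile threshold is the figure quoted above.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def dr_site_far_enough(primary, dr, minimum_miles: float = 100.0) -> bool:
    return haversine_miles(*primary, *dr) >= minimum_miles

# Hypothetical sites (latitude, longitude).
new_york = (40.7128, -74.0060)
philadelphia = (39.9526, -75.1652)
boston = (42.3601, -71.0589)

philadelphia_ok = dr_site_far_enough(new_york, philadelphia)
boston_ok = dr_site_far_enough(new_york, boston)
```

Distance alone is a proxy: two sites far apart can still share a power grid, a flood plain, or a telecom carrier, so the check complements rather than replaces a threat-specific assessment.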
🌐 Real-world example — power failure
In May 2017, British Airways suffered a major IT outage traced to a power failure at one of its data centers. The power loss caused servers to shut down uncleanly, corrupting data in ways that took days to fully identify and repair. Hundreds of flights were cancelled and roughly 75,000 passengers were affected. Investigation attributed the disruption to human error: power to the facility was cut and then restored in an uncontrolled way, defeating the UPS, the very system designed to protect against power disruptions. The incident illustrated that physical infrastructure controls require procedural safeguards (change management, maintenance procedures with rollback plans) that are as rigorous as those applied to software changes.
Manage the information system lifecycle
Security must be integrated into every phase of the information system lifecycle — not added at the end as a compliance check. Systems that are designed without security must be retrofitted at far greater cost. Systems that are decommissioned without security controls leave data remnants that persist indefinitely.
Phase 1: Stakeholder needs
Phase 2: Requirements
Phase 3: Architecture
Phase 4: Development
Phase 5: Integration
Phase 6: Verification
Phase 7: Deployment
Phase 8: Operations
Phase 9: Retirement
Stakeholder needs and requirements
Security requirements must be elicited alongside functional requirements — not after the system is designed. Who are the users? What data will the system process? What regulations apply? What are the availability requirements? What is the threat model? Answers to these questions determine the security architecture before a line of code is written.
🌐 Real-world example
A healthcare startup building a patient portal fails to identify HIPAA applicability during stakeholder requirements gathering, classifying the project as “a simple web app.” Six months into development, a compliance review identifies 23 HIPAA technical safeguard requirements not incorporated into the architecture. The cost to retrofit encryption, access controls, and audit logging is triple what it would have been if identified at the requirements stage. Security requirements discovered late cost far more to address than those identified at the start.
The security architecture and engineering chain
Every gap in this chain is where a design flaw, implementation vulnerability, or physical failure will expose what the architecture was supposed to protect.
Apply secure design principles from day one
Least privilege, defense in depth, secure defaults, fail securely, zero trust — built in at design, not retrofitted at launch.
Select the right security model
Bell-LaPadula for confidentiality, Biba for integrity, Clark-Wilson for commercial transactions. The model defines what the system must enforce.
Leverage hardware security capabilities
TPM, secure boot, memory protection, HSMs — use what hardware provides before adding software controls.
Assess each architecture type’s specific risks
Cloud, ICS, IoT, containers, microservices — each has a distinct attack surface and distinct mitigation set.
Select appropriate cryptographic solutions
AES-256 at rest, TLS 1.3 in transit, ECC for key exchange, PKI for identity. Plan for post-quantum migration now.
Design against known cryptanalytic attacks
Authenticated encryption, constant-time comparisons, protected key storage, hardware fault detection.
Apply physical security as a layer
Layered perimeters, mantrap entry, CPTED design, environmental controls, fire suppression, and power redundancy.
Integrate security across the full IS lifecycle
Requirements → Architecture → Development → Operations → Retirement. Security at every phase, including documented decommissioning.

By profession, a Cloud Security Consultant; by passion, a storyteller. Through SunExplains, I explain security in simple, relatable terms, connecting technology, trust, and everyday life.