English Interview Last-Minute Cram Notes

Self-Introduction:

Hello, I’m Weiquan Huang, a third-year Software Engineering student at Shanghai Jiao Tong University. C++ is the programming language I’m most familiar with, and I’ve built a strong foundation in computer systems through courses like Distributed Systems and Operating Systems. My key projects include developing the LSMKV Log-Structured Merge Tree and the ChFS Distributed File System. I also participated in my university’s research program, designing an RDMA-based Adaptive Radix Tree Key-Value Store for high-performance computing. I’m passionate about computer systems and thrive in collaborative environments, so I’m excited to bring my skills to Morgan Stanley’s innovative projects. Thank you!

Pardon me. I’m afraid I missed your question. Would you mind saying it again?

Would you please give me a moment to think about the problem?

When can I start?

What do you enjoy most about working at Morgan Stanley?

I’m at the start of my career and want to keep growing. Does your company have a mentorship program?

Why do you want to join xxx?

Morgan Stanley is a leading global financial services firm with a strong reputation in investment banking, particularly in M&A and capital markets, as well as a robust wealth management division that serves high-net-worth clients.
I’m particularly impressed by Morgan Stanley’s commitment to innovation in financial technology, which aligns with my interest in high performance computing.
I admire how Morgan Stanley has adapted to the evolving financial landscape, especially in sustainable investing, which is becoming increasingly important in the industry.

  1. Global Reputation: Morgan Stanley is a leading global financial services firm, known for its strong presence in investment banking, wealth management, and institutional securities. Working there offers exposure to high-profile clients and complex financial deals.
  2. Career Growth: The firm provides robust training programs, mentorship, and opportunities for advancement. It’s known for fostering talent and offering clear career paths, especially in competitive fields like finance.
  3. Diverse Opportunities: With divisions spanning investment banking, asset management, and trading, employees can explore varied roles and industries, gaining broad expertise.
  4. Innovative Culture: Morgan Stanley invests in technology and innovation, particularly in fintech and sustainable investing, which appeals to those interested in cutting-edge financial solutions.
  5. Strong Compensation: The firm offers competitive salaries, bonuses, and benefits, aligning with industry standards for top-tier finance roles.
  6. Workplace Culture: It emphasizes diversity, inclusion, and collaboration, creating a dynamic environment with global teams.

C++ Term

  1. difference between #define and inline
    1. #define is processed during preprocessing and performs simple text substitution only: it cannot type-check its arguments and cannot use C++ class member access control.
    2. inline suggests to the compiler that the function body be expanded at the call site, but the compiler may refuse. An inline function is a true function with strict parameter checking at call time, and it can also be a member function of a class.
  2. what’s recursive: refers to a process or function that calls itself as a subroutine to solve a problem by breaking it into smaller, similar subproblems. In programming, a recursive function repeatedly invokes itself with modified arguments until it reaches a base case that stops the recursion.
  3. what’s stack overflow? A stack overflow happens when a program uses too much memory on the call stack, usually due to a recursive function calling itself too many times without stopping. The stack, a limited memory area for function calls, runs out of space, causing the program to crash.
  4. what’s virtual function? A virtual function in C++ is a member function in a base class that can be overridden by a derived class to provide specific behavior, enabling polymorphism. It allows a program to call the correct function implementation based on the actual object type at runtime, not the pointer or reference type.
  5. critical section? It refers to a part of a program where a process or thread accesses shared resources (e.g., variables, files) and must execute without interference from other processes or threads to avoid data corruption or inconsistencies. (i.e., a code region protected by a lock, to avoid data races when multiple threads access it at the same time.)
  6. difference between process and threads?
    1. A process is an independent program in execution, with its own memory space, resources, and state. It’s like a running application (e.g., a web browser or a text editor).
    2. A thread is a smaller unit of execution within a process. Multiple threads within the same process share the same memory and resources but have their own execution path (like separate tasks within the same program).
  7. time complexity of linked list and array?
    1. linked list:
      1. Insertion at the head: create a new node whose next pointer points to the current head, then make the new node the head. O(1)
      2. Insertion at index k? O(k): iterate over the list until reaching the position, then relink.
    2. array:
      1. shift every element at or after the insertion index one position to the right, then place the new element at that index. O(n) in the worst case.
  8. class and struct
    1. members are private / public by default in class / struct.
    2. inheritance for a class is private by default, public for a struct instead.
    3. both can be templated; in modern C++ the only real differences are the default access level and the default inheritance mode.
  9. sizeof and strlen: sizeof is an operator and strlen is a library function. For example, with char a[30] = {'a', '\0'};, sizeof(a) is 30 while strlen(a) is 1.
    1. sizeof is evaluated at compile time, while strlen is computed at run time.
  10. difference of pointer and reference
    1. Pointer: A variable that stores the memory address of another variable. It can be changed to point to different variables or set to nullptr.
    2. Reference: An alias for an existing variable. Once initialized, it cannot be changed to refer to another variable and always refers to the same object.
| Aspect | Pointer | Reference |
| --- | --- | --- |
| Definition | Stores the memory address of a variable. | Acts as an alias for an existing variable. |
| Syntax | Declared with `*` (e.g., `int* ptr;`); uses `*` to dereference, `&` for address-of. | Declared with `&` (e.g., `int& ref = var;`); no special operator to access the value. |
| Initialization | Can be uninitialized or set to `nullptr`. | Must be initialized at declaration. |
| Reassignment | Can be reseated to point to a different variable. | Cannot be reseated; always refers to the same variable. |
| Nullability | Can be `nullptr`, requiring null checks to avoid errors. | Cannot be null; always refers to a valid variable. |
| Memory address | Explicitly stores and manipulates memory addresses. | Does not expose an address; the compiler handles the binding internally. |
| Size | Depends on the architecture (e.g., 8 bytes on a 64-bit system). | No observable storage of its own; `sizeof` on a reference yields the referred type's size. |
| Use in functions | Pass-by-pointer allows modifying the original variable or passing `nullptr`. | Pass-by-reference gives cleaner syntax for modifying the original variable. |
| Arithmetic | Supports pointer arithmetic (e.g., `ptr++` to move to the next element). | No arithmetic operations; references are not addresses. |
| Safety | Riskier due to possible dangling pointers or null dereferencing. | Safer: always bound to a valid variable, no explicit dereferencing. |
| Arrays/containers | Commonly used for dynamic arrays or navigating data structures (e.g., linked lists). | Refers to single variables or objects; cannot represent arrays directly. |
  11. what's object-oriented programming? A programming paradigm that organizes code around objects, which are instances of classes. A class defines the properties (data) and behaviors (functions) of objects, modeling real-world entities or abstract concepts. OOP emphasizes modularity, reusability, and maintainability through key principles:
    1. Encapsulation: **bundling data (attributes) and methods (functions)** that operate on that data into a single unit (class), while **restricting access** to some components to protect the object's integrity.
    2. Inheritance: a derived class reuses and extends the data and behavior of a base class.
    3. Polymorphism: the same interface invokes different implementations depending on the object's runtime type (e.g., via virtual functions).
  12. Constructor and Destructor:
    1. Constructor: a special member function automatically called when an object of a class is created. It initializes the object's data members and may acquire resources (e.g., memory, file handles).
    2. Destructor: a special member function automatically called when an object goes out of scope or is explicitly deleted (e.g., with delete). It cleans up resources (e.g., frees memory, closes files).
  13. type conversion / type casting
    1. implicit conversion: performed by the compiler, mainly between built-in types such as int, long, double, etc.
    2. explicit conversion:
      1. static_cast performs conversions checked at compile time only: basic type conversions and pointer upcasts (child to parent); downcasts (parent to child) are unchecked and may be unsafe.
      2. dynamic_cast is like static_cast for class hierarchies, but it checks the conversion at run time and rejects unsafe downcasts (returning nullptr for pointers, throwing for references).
      3. const_cast adds or removes const from a pointer or reference; modifying an object that was originally declared const is still undefined behavior.
      4. reinterpret_cast reinterprets the bits according to another type; for example, an int and a float with the same bit pattern are interpreted differently.
  14. new and malloc
    1. new computes the allocation size automatically from the type and count, while malloc must be given the size explicitly.
    2. operator new may be implemented on top of malloc or another allocation method (default: malloc).
    3. new returns a pointer of the corresponding object type (and runs constructors), while malloc returns void*.
    4. memory allocated by new cannot be resized, while memory from malloc can be grown with realloc.
  15. RAII: a local object manages a resource; when the object is destructed, it releases the resource.
    1. std::lock_guard and std::unique_lock acquire the mutex on construction and release it on destruction.
    2. smart pointers (std::unique_ptr, std::shared_ptr) free the owned object on destruction.
  16. std::mutex is a synchronization primitive provided by the C++ Standard Library (in the <mutex> header) to manage access to shared resources in a multithreaded program. It ensures mutual exclusion, meaning only one thread can execute a critical section at a time, preventing race conditions and data corruption.
  17. Templates:
  • Generic programming mechanism for writing type-agnostic code (e.g., std::vector).
  • Interview Focus: function templates, class templates, SFINAE, and concepts (C++20).
  • Relevance: common in performance-critical libraries and low-latency systems such as RDMA stacks.
  18. Condition Variable
  • std::condition_variable for thread synchronization, allowing threads to wait for conditions.
  • Interview Focus: wait(), notify_one(), notify_all(), and avoiding spurious wakeups.

LeetCode

brute force method 暴力法
which has the time complexity: n squared n方复杂度
every combination of 2 values 两数组合
if they can sum up to our targeting value 和为
in this case 在这种情况下
index, indices 索引
declare a variable 声明一个变量
nested for loop 嵌套for循环
use this hash map to store key-value pairs 用哈希表存储键值对
iterate over this array 遍历这个数组
move on to the next element x 移动到
difference 差值
8 minus 6 8减6
iterative way and recursive way 迭代、递归
meanwhile 同时
singly/doubly linked list 链表
repeat over and over again
reaches the end of the linked list
key takeaway 关键重点
cycle 循环
in the normal case 一般情况下

LSMKV

The log-structured merge tree is a key-value storage project divided into a memory layer and a persistent layer.

  • The memory layer uses a Bloom filter and a skip list for insertion and deletion. The Bloom filter determines if a key is definitely not in the current memory layer, while the skip list enables search and insertion with O(log n) time complexity.
  • When the memory layer’s key-value pairs and Bloom filter exceed a certain size, the data is packaged into an SSTable with a timestamp and key range (min/max) and stored in the persistent layer’s level 0.
  • Each persistent layer has a size limit, typically 2^(n+1) SSTables. If the limit is exceeded, SSTables from the current layer that need to be moved down are merged with overlapping SSTables from the next layer to deduplicate keys.
  • Key-value storage is separated, with values stored in a VLog, and SSTable’s voffset is used to locate specific values.
  • Garbage collection involves checking earlier key-value pairs in the VLog, reinserting pairs with still-valid values, and using fallocate to punch holes at the file’s tail to reclaim physical disk space.
Bloom Filter

The Bloom filter uses multiple approximately independent hash functions. During a lookup, these hash functions generate multiple indices; if any bit at those indices is not 1, the key is definitely absent from the memory layer. For insertion, each bit at the indices obtained from the hash functions is set to 1, followed by the actual insertion into the memory layer's skip list.

Skip List

A skip list is a layered linked list where searching for an element starts at the highest level. The search moves forward while the next element is smaller than the target and descends a level otherwise, continuing until the element is found or the bottom level is exhausted. For insertion, it first locates the appropriate insertion point, then probabilistically determines how many levels the new node occupies, and finally links the adjacent nodes at each level.

Compaction

When merging multiple SSTables, all keys are sorted using a multi-way merge method, removing key-value pairs with identical keys but older timestamps. A new Bloom filter is generated, and finally, the filter and keys are written to the SSTable based on file size limits.

CHFS

CHFS is a distributed file system.

  • The single-node file system version adopts the inode file system architecture, using an inode table, inode bitmap, and block bitmap for block management. Each file or directory is marked by an inode block containing metadata and the block numbers of specific data blocks. When reading or writing a file or directory’s contents, the block numbers are iterated over from the inode block, and data is read from or written to those data blocks.

  • The distributed file system version includes a client server for sending file read/write requests, a metadata server for managing the directory structure and file metadata, and multiple data servers for handling the actual file data. Concurrency is controlled using Two-Phase Locking (2PL), and crash consistency is achieved through redo logs and checkpoints.

  • Two-Phase Locking (2PL) is a concurrency control mechanism ensuring before-or-after serializability in distributed systems. It has two phases: the growing phase, where a transaction acquires locks (shared or exclusive) without releasing any, and the shrinking phase, where locks are released without acquiring new ones. This prevents conflicts and ensures consistent data access.

  • Redo Log is a crash recovery technique that records changes (redo entries) made by transactions in a durable log before applying them to the database. If a crash occurs, the system replays the redo log to restore the database to a consistent state, ensuring no committed changes are lost.

  • Checkpoint is a mechanism to reduce recovery time after a crash. It periodically saves a consistent snapshot of the database state to disk, marking all transactions up to that point as complete. During recovery, the system starts from the latest checkpoint and applies redo log entries from there, minimizing the amount of log replay needed.

RDMA

The RDMA-based disaggregated adaptive radix tree key-value storage system consists of compute nodes and memory nodes; compute nodes send requests to memory nodes to retrieve the data of the next child node. The node size of the adaptive radix tree scales dynamically with the number of children (e.g., 8, 16, 48, or 256 slots). When searching for a key, the system first checks an express skip table for a fast path to the key. If found, it jumps directly to that node and continues the search downward; otherwise, it traverses the tree level by level.

OS

  1. Process vs. Thread:

    • Process: Independent program in execution with its own memory space.
    • Thread: Lightweight unit of a process, sharing the same memory space with other threads in the process.
    • Interview Focus: Differences, context switching, multithreading benefits.
  2. Context Switching:

    • The process of saving and restoring a CPU’s state to switch between processes or threads.
    • Interview Focus: Overhead, impact on performance.
  3. Virtual Memory:

    • Abstraction of physical memory, allowing processes to use more memory than physically available via paging or segmentation.
    • Interview Focus: Page tables, TLB (Translation Lookaside Buffer), swapping.
  4. Paging and Page Fault:

    • Paging: Dividing memory into fixed-size pages for efficient allocation.
    • Page Fault: When a program accesses a page not in RAM, triggering OS to load it.
    • Interview Focus: Handling page faults, demand paging.
  5. Deadlock:

    • Situation where two or more processes are unable to proceed because each is waiting for the other to release a resource.
    • Interview Focus: Conditions (mutual exclusion, hold-and-wait, no preemption, circular wait), prevention, detection.
  6. Mutex vs. Semaphore:

    • Mutex: Locking mechanism for mutual exclusion, allowing one thread access to a resource.
    • Semaphore: Signaling mechanism controlling access to a shared resource with a counter.
    • Interview Focus: Use cases, binary vs. counting semaphores.
  7. Inter-Process Communication (IPC):

    • Methods for processes to communicate (e.g., pipes, message queues, shared memory).
    • Interview Focus: Trade-offs, performance implications.
  8. Scheduling Algorithms:

    • How the OS assigns CPU time to processes (e.g., FCFS, Round-Robin, Priority Scheduling).
    • Interview Focus: Starvation, fairness, real-time scheduling.
  9. File Systems:

    • Structures for storing and retrieving data (e.g., FAT, NTFS, ext4).
    • Interview Focus: Inodes, journaling, performance bottlenecks.
  10. Kernel vs. User Space:

    • Kernel Space: Privileged mode where OS core runs.
    • User Space: Restricted mode for applications.
    • Interview Focus: System calls, privilege separation.
  11. Concurrency and Parallelism:

    • Concurrency: Managing multiple tasks that can start, run, and complete in overlapping time periods.
    • Parallelism: Executing multiple tasks simultaneously.
    • Interview Focus: Implementation in OS, multicore systems.
  12. Signals:

    • Asynchronous notifications sent to a process to handle events (e.g., SIGKILL, SIGTERM).
    • Interview Focus: Handling, default actions.
  13. Zombie and Orphan Processes:

    • Zombie: Process that has completed but still has an entry in the process table.
    • Orphan: Process whose parent has terminated, adopted by init.
    • Interview Focus: Process management, cleanup.
  14. Memory Management:

    • How the OS allocates and deallocates memory (e.g., segmentation, paging, compaction).
    • Interview Focus: Fragmentation (internal/external), garbage collection.
  15. I/O Models:

    • Mechanisms for handling input/output (e.g., blocking, non-blocking, asynchronous I/O, epoll/select).
    • Interview Focus: Scalability in networked systems.

Interview Tips:

  • Expect questions on Linux/Unix systems, as they dominate internet company infrastructure.
  • Be ready for practical scenarios (e.g., “How would you debug a deadlock?” or “How does fork() work internally?”).
  • Know system calls like fork(), exec(), wait(), mmap(), and their use cases.
  • Be prepared to discuss trade-offs (e.g., performance vs. complexity in scheduling or memory management).

fork

  • returns the child's PID in the parent process, and 0 in the child process
  • Operating System allocates a new entry in the process table for the child process, stores PID, parent PID, process state, etc.
  • The child’s process control block(PCB) is created, copying the parent’s PCB. The PCB includes:
    • Program Counter(PC)
    • Stack pointer, register, and CPU state
    • Open file descriptors
    • Signal handlers and pending signals
  • Copy on Write strategy for memory duplication
  • The child process is added to the scheduler’s ready queue, making it eligible to run
  1. System Calls:
    • Interface between user-space applications and the kernel for services like file operations, process control, or networking (e.g., read(), write(), fork(), mmap()).
    • Interview Focus: Common system calls, their overhead, and how they differ from library functions. Expect questions like “What happens during a system call?” or “How does fork() interact with the kernel?”
  2. Interrupts:
    • Signals to the CPU from hardware (e.g., I/O devices) or software, causing the OS to pause the current task and handle the event.
    • Interview Focus: Types (hardware, software, exceptions), interrupt handling, and latency. Questions may involve trade-offs in interrupt-driven vs. polling-based I/O.
  3. Thread Synchronization:
    • Mechanisms to coordinate threads accessing shared resources (e.g., locks, condition variables, atomic operations).
    • Interview Focus: std::condition_variable, spinlocks, reader-writer locks, and deadlock avoidance. Expect coding questions like implementing a thread-safe queue.
  4. Memory Mapping (mmap):
    • Maps files or devices into a process’s virtual memory, enabling direct access without explicit read/write calls.
    • Interview Focus: Use cases (e.g., file I/O, shared memory), mmap vs. read/write, and zero-copy I/O.
  5. Page Cache:
    • Kernel-managed cache of file data in RAM to speed up disk I/O.
    • Interview Focus: How it interacts with read()/write(), cache eviction, and performance implications.
  6. Fork and Exec:
    • fork() creates a child process; exec() replaces the process’s memory with a new program.
    • Interview Focus: Internals of fork(), the exec() family (execl, execvp), and their use in process creation (e.g., in web servers). Questions like “What state is inherited after fork()?” are common.
  7. Signal Handling:
    • Managing asynchronous events (e.g., SIGINT, SIGKILL) via signal handlers or sigaction().
    • Interview Focus: Signal delivery, masking, and race conditions. Expect questions like “How do you safely handle signals in a multithreaded program?”
  8. Epoll/Select/Poll:
    • Mechanisms for monitoring multiple file descriptors for I/O events, critical for scalable network servers.
    • Interview Focus: epoll vs. select vs. poll, edge-triggered vs. level-triggered modes, and performance in high-concurrency systems like web servers.
  9. Asynchronous I/O (AIO):
    • Non-blocking I/O operations (e.g., aio_read, aio_write) that allow a process to continue while I/O completes.
    • Interview Focus: Use cases (databases, event-driven servers), libaio vs. io_uring, and trade-offs vs. synchronous I/O.
  10. io_uring:
    • Modern Linux I/O interface for high-performance, asynchronous I/O with reduced system call overhead.
    • Interview Focus: How it improves on epoll or AIO, ring buffer mechanics, and use in frameworks like libuv. Common in interviews at companies like Cloudflare or Meta.
  11. Cgroups and Namespaces:
    • Cgroups: Control resource allocation (CPU, memory, I/O) for processes.
    • Namespaces: Isolate process environments (e.g., PID, network, mount) for containers.
    • Interview Focus: Containerization (Docker, Kubernetes), resource limits, and namespace types. Questions like “How do namespaces enable container isolation?” are common.
  12. Copy-on-Write (COW):
    • Optimization where memory is shared until a write occurs, used in fork() and memory management.
    • Interview Focus: Implementation details, page table updates, and performance benefits. Expect questions like “How does COW reduce fork() overhead?”
  13. RDMA (Remote Direct Memory Access):
    • Allows direct memory access between systems without CPU involvement, used in high-performance networking.
    • Interview Focus: Queue pairs, memory regions, and verbs like RDMA_READ/RDMA_WRITE.
  14. Futex (Fast User-space Mutex):
    • Low-level synchronization primitive for efficient locking in user space, with kernel fallback for contention.
    • Interview Focus: How futexes power pthread_mutex, performance vs. other locks, and use in high-concurrency systems.
  15. Scheduler and CFS (Completely Fair Scheduler):
    • Linux’s CFS assigns CPU time based on process “virtual runtime” for fairness.
    • Interview Focus: Scheduling policies (real-time, batch), nice values, and impact on system performance. Questions like “How does CFS handle CPU-bound vs. I/O-bound processes?” are common.
  16. Zero-Copy I/O:
    • Techniques (e.g., sendfile, splice, mmap) to avoid copying data between kernel and user space.
    • Interview Focus: Use cases (e.g., streaming, proxies), performance gains, and limitations.
  17. Swap Space:
    • Disk space used when RAM is full, storing less-active memory pages.
    • Interview Focus: Swapping vs. OOM (Out-Of-Memory) killer, performance impact, and tuning swappiness.
  18. Kernel Bypass:
    • Techniques (e.g., DPDK, RDMA) to avoid kernel overhead in networking or storage.
    • Interview Focus: Use in high-performance systems, trade-offs (e.g., bypassing kernel security), and RDMA’s role.
  19. NUMA (Non-Uniform Memory Access):
    • Memory architecture where access times vary by CPU and memory location.
    • Interview Focus: NUMA-aware programming, numactl, and optimizing for multi-socket servers.
  20. OOM Killer:
    • Kernel mechanism that terminates processes when memory is critically low.
    • Interview Focus: How it selects victims (oom_score), configuration, and avoiding OOM scenarios in production.

Interview-Relevant Notes

  • Linux Focus: Internet companies heavily use Linux, so terms like epoll, io_uring, cgroups, and RDMA are emphasized.
  • Practical Scenarios: Expect questions like “Debug a high-CPU-usage issue” or “Optimize I/O for a web server.” Tools like strace, perf, and gdb may come up.
  • Concurrency and Scalability: Terms like futex, epoll, and io_uring are critical for roles involving high-concurrency systems (e.g., at Google, Amazon, or ByteDance).
  • Distributed Systems Context: Cgroups, namespaces, and RDMA are relevant for containerized or low-latency systems, common in cloud infrastructure roles.

Tips for Preparation

  • Practice Coding: Implement a simple server using epoll or a thread pool with futexes.
  • System Design: Be ready to discuss OS concepts in system design (e.g., how mmap or RDMA fits into a database).
  • Debugging: Know how to use strace, lsof, or top to diagnose issues like deadlocks or memory leaks.
  • Relate to Role: For backend roles, focus on epoll, io_uring, and synchronization; for infrastructure, emphasize cgroups, namespaces, and RDMA.

Computer Network

Computer Networking Terms for Internet Company Interviews

1. Core Networking Concepts

  1. OSI Model:

    • A 7-layer framework (Physical, Data Link, Network, Transport, Session, Presentation, Application) for understanding network protocols.
    • Interview Focus: Layer responsibilities, mapping protocols (e.g., TCP to Transport layer), and differences from TCP/IP model.
  2. TCP/IP Model:

    • A 4-layer model (Link, Internet, Transport, Application) used in practice for internet protocols.
    • Interview Focus: How it maps to OSI, protocol examples (e.g., IP at Internet layer, TCP/UDP at Transport).
  3. IP (Internet Protocol):

    • Handles addressing and routing of packets (IPv4, IPv6).
    • Interview Focus: IP addressing, subnetting, CIDR, and IPv4 vs. IPv6 differences.
  4. TCP (Transmission Control Protocol):

    • Reliable, connection-oriented protocol ensuring ordered delivery and error correction.
    • Interview Focus: 3-way handshake, congestion control, flow control, and TCP state machine (e.g., SYN, ACK, FIN).
  5. UDP (User Datagram Protocol):

    • Connectionless, lightweight protocol for low-latency applications (e.g., DNS, streaming).
    • Interview Focus: TCP vs. UDP trade-offs, use cases, and handling packet loss.
  6. DNS (Domain Name System):

    • Resolves domain names to IP addresses.
    • Interview Focus: DNS resolution process, caching, recursive vs. iterative queries, and tools like dig.
  7. HTTP/HTTPS:

    • Protocols for web communication, with HTTPS adding TLS/SSL for security.
    • Interview Focus: Status codes (e.g., 200, 404, 503), REST APIs, and TLS handshake.
  8. TLS/SSL (Transport Layer Security/Secure Sockets Layer):

    • Secures data transmission via encryption and authentication. (Symmetric encryption: shared key; asymmetric encryption: e.g., RSA, with a private/public key pair.)
    • Interview Focus: Handshake process, certificates, symmetric vs. asymmetric encryption.
      (The server sends the client a challenge, the client signs the challenge with its private key, and the server verifies the signature with the client’s public key to confirm the client’s identity.)

2. Network Performance and Scalability

  1. Load Balancing:

    • Distributes network traffic across multiple servers (e.g., round-robin, least connections).
    • Interview Focus: Layer 4 vs. Layer 7 load balancing, algorithms, and tools like NGINX or AWS ELB.
  2. CDN (Content Delivery Network):

    • Distributed servers caching content closer to users to reduce latency.
    • Interview Focus: How CDNs work, cache invalidation, and companies like Akamai or Cloudflare.
  3. Congestion Control:

    • Mechanisms (e.g., TCP’s AIMD, Reno, Cubic) to prevent network overload.
    • Interview Focus: TCP congestion algorithms, packet loss handling, and performance tuning.
  4. RDMA (Remote Direct Memory Access):

    • Allows direct memory access between systems over a network, bypassing CPU and OS.
    • Interview Focus: Queue pairs, verbs (e.g., RDMA_READ, RDMA_WRITE), InfiniBand/RoCE, and use cases (e.g., databases, HPC).
    • Relevance: RDMA intersects with OS concepts like mmap for zero-copy I/O.
  5. Zero-Copy Networking:

    • Techniques to avoid copying data between kernel and user space (e.g., sendfile, RDMA).
    • Interview Focus: Performance benefits, integration with OS (relates to your March 20, 2025 zero-copy question), and use in high-throughput systems.
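
The `sendfile` technique mentioned above can be demonstrated end to end with Python's `os.sendfile` wrapper, assuming a Linux-like platform where file-to-socket transfers are supported. The kernel moves the bytes from the page cache straight into the socket; the payload never passes through a user-space buffer.

```python
import os
import socket
import tempfile

# Zero-copy transfer: sendfile(2) moves file bytes into a socket inside
# the kernel, skipping the usual read()-into-buffer / write()-from-buffer
# round trip through user space.
payload = b"zero-copy!" * 150            # 1500 bytes of test data

srv = socket.socket()
srv.bind(("127.0.0.1", 0))               # port 0: kernel picks a free port
srv.listen(1)
cli = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()

with tempfile.TemporaryFile() as f:
    f.write(payload)
    f.flush()
    offset = 0
    while offset < len(payload):         # sendfile may send fewer bytes than asked
        offset += os.sendfile(cli.fileno(), f.fileno(), offset,
                              len(payload) - offset)
cli.close()

received = b""
while len(received) < len(payload):
    received += conn.recv(65536)
conn.close()
srv.close()
print(received == payload)               # True
```

For large static files this saves two copies and two user/kernel crossings per chunk, which is why web servers like Nginx use it for file serving.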

3. Network Programming and OS Integration

  1. Sockets:

    • Endpoints for network communication (e.g., TCP, UDP sockets).
    • Interview Focus: Socket APIs (socket(), bind(), connect(), accept()), non-blocking sockets, and coding questions like building a TCP server.
  2. Epoll/Select/Poll:

    • OS mechanisms for monitoring multiple sockets for I/O events.
    • Interview Focus: Scalability of epoll vs. select, edge vs. level triggering, and use in event-driven servers (e.g., Node.js, Nginx).
    • Relevance: Discussed in your OS terms request as critical for networked systems.
  3. io_uring:

    • Linux’s high-performance I/O interface for asynchronous networking and file operations.
    • Interview Focus: How it reduces system call overhead, ring buffer mechanics, and use in modern servers (e.g., Redis, ScyllaDB).
    • Relevance: Mentioned in your OS terms request for scalable I/O.
  4. Kernel Bypass:

    • Techniques like DPDK or RDMA to bypass kernel networking stack for low-latency.
    • Interview Focus: Trade-offs (e.g., losing kernel features like TCP), RDMA’s role, and use in high-frequency trading or cloud providers.
    • Relevance: Your RDMA questions suggest interest in kernel bypass.
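
The socket and epoll mechanics from this section can be sketched with Python's stdlib `selectors` module, which wraps the best readiness API available (epoll on Linux, kqueue on BSD/macOS). The callback-per-event structure below mirrors how event-driven servers like Nginx or Node.js are organized; the function names (`accept`, `echo`) are just this example's.

```python
import selectors
import socket

# Event-driven echo server on non-blocking sockets: one selector watches
# both the listening socket and each accepted connection, dispatching a
# callback when a socket becomes readable.
sel = selectors.DefaultSelector()
echoed = []

def accept(srv):
    conn, _ = srv.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    data = conn.recv(4096)
    if data:
        conn.sendall(data)   # tiny write; assumed to fit the send buffer
        echoed.append(data)
    else:                    # empty read: peer closed the connection
        sel.unregister(conn)
        conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen()
srv.setblocking(False)
sel.register(srv, selectors.EVENT_READ, accept)

cli = socket.create_connection(srv.getsockname())
cli.sendall(b"hello")
while not echoed:                        # one accept event, then one read event
    for key, _ in sel.select(timeout=1):
        key.data(key.fileobj)            # key.data holds the callback
reply = cli.recv(4096)
print(reply)                             # b'hello'
cli.close()
srv.close()
sel.close()
```

The scalability argument for epoll over `select` is visible in this structure: the kernel tracks the interest set once at `register` time, so each `select()` call costs O(ready events) rather than O(all watched fds).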

4. Network Security and Reliability

  1. Firewall:

    • Filters network traffic based on rules (e.g., iptables, nftables).
    • Interview Focus: Rule configuration, stateful vs. stateless firewalls, and debugging blocked connections.
  2. NAT (Network Address Translation):

    • Maps private IP addresses to public ones for internet access.
    • Interview Focus: NAT tables, port forwarding, and impact on network design.
  3. DDoS (Distributed Denial of Service):

    • Attack overwhelming a server with traffic.
    • Interview Focus: Mitigation (e.g., rate limiting, CDN, BGP routing), and tools like Cloudflare or AWS Shield.
  4. Packet Loss and Retransmission:

    • Loss of packets due to congestion or errors, handled by TCP retransmission.
    • Interview Focus: Impact on performance, TCP’s retransmission timeout (RTO), and diagnosing packet loss (e.g., tcpdump).

5. Advanced and Distributed Systems

  1. BGP (Border Gateway Protocol):

    • Routing protocol for exchanging routes between autonomous systems.
    • Interview Focus: Path selection, route flapping, and use in large-scale networks (e.g., ISPs, cloud providers).
  2. QUIC:

    • UDP-based protocol (used in HTTP/3) for low-latency, multiplexed connections.
    • Interview Focus: Advantages over TCP (e.g., faster handshake), congestion control, and adoption challenges.
  3. gRPC:

    • High-performance RPC framework using HTTP/2 and Protocol Buffers.
    • Interview Focus: Use cases (microservices), performance vs. REST, and integration with load balancers.
  4. VPC (Virtual Private Cloud):

    • Isolated network environment in cloud providers (e.g., AWS, GCP).
    • Interview Focus: Subnets, security groups, and routing tables in cloud networking.

Interview-Relevant Notes

  • Linux Networking: Terms like epoll, io_uring, and RDMA are critical, as internet companies rely on Linux for servers. Your OS questions (e.g., fork(), mmap) show you’re engaging with Linux system calls, which pair well with socket programming and RDMA.
  • Practical Scenarios: Expect coding questions (e.g., “Write a TCP client/server”) or debugging tasks (e.g., “Diagnose a dropped connection using tcpdump or netstat”). Your April 5, 2025 interest in troubleshooting aligns with this.
  • Scalability and Performance: Terms like load balancing, CDNs, QUIC, and RDMA are emphasized for roles at companies like Google, Meta, or ByteDance, where low-latency systems are key.
  • RDMA Relevance: Your RDMA questions (e.g., queue pairs, address translation) suggest you’re targeting roles involving high-performance networking, making terms like kernel bypass and zero-copy critical.
  • Distributed Systems: VPC, gRPC, and BGP are relevant for cloud or microservices roles, often paired with OS concepts like cgroups/namespaces (from your April 18, 2025 OS terms request).

Tips for Preparation

  • Coding Practice: Implement a simple TCP/UDP server using sockets and epoll, or experiment with RDMA verbs (e.g., ibv_post_send).
  • Tools: Learn tcpdump, wireshark, netstat, and nslookup for debugging (relates to your April 5, 2025 troubleshooting interest).
  • System Design: Be ready to discuss networking in system design (e.g., “Design a CDN” or “Scale a web server with load balancing”).
  • OS-Network Overlap: Study how OS concepts (e.g., mmap, fork(), io_uring) interact with networking (e.g., zero-copy, socket buffers), leveraging your OS knowledge.

Below is a concise list of key knowledge areas about TCP (Transmission Control Protocol) in computer networking, tailored for internet company interviews (e.g., for backend development, systems programming, or infrastructure roles). Given your prior questions about operating systems (e.g., fork(), mmap) and networking (e.g., RDMA, epoll), I’ve emphasized TCP details relevant to high-performance systems, scalability, and OS integration, while connecting to your interests where applicable (e.g., RDMA’s contrast to TCP, socket programming). These concepts are frequently tested in interviews due to TCP’s critical role in reliable network communication.

Key Knowledge Areas About TCP

1. Core Characteristics

  • Reliable Delivery: Ensures packets are delivered without errors, in order, and retransmits lost packets.
  • Connection-Oriented: Establishes a connection before data transfer (via handshake) and closes it afterward.
  • Stream-Based: Treats data as a continuous stream, not discrete packets, with no message boundaries.
  • Interview Focus: Contrast with UDP (connectionless, unreliable), use cases (e.g., HTTP, email vs. DNS, streaming).
  • Relevance: Your networking terms request (April 18, 2025) included TCP vs. UDP, showing interest in protocol trade-offs.

2. Three-Way Handshake

  • Process to establish a TCP connection:
    1. SYN: Client sends a segment with SYN flag and initial sequence number (ISN).
    2. SYN-ACK: Server responds with SYN, ACK flags, its ISN, and acknowledges client’s ISN.
    3. ACK: Client acknowledges server’s ISN.
  • Interview Focus: Steps, sequence numbers, and handling failures (e.g., lost SYN). Questions like “Draw the handshake” or “What happens if SYN-ACK is lost?” are common.
  • Relevance: Ties to socket programming (connect(), accept()), which aligns with your OS focus (e.g., epoll).

3. Connection Termination (Four-Way Close)

  • Process to close a TCP connection:
    • Each side sends a FIN and receives an ACK, requiring four steps (FIN → ACK, FIN → ACK).
    • Can be initiated by either side; supports half-closed states (e.g., one side still sending).
  • Interview Focus: FIN vs. RST (reset), TIME_WAIT state, and resource cleanup. Expect questions like “Why does TIME_WAIT exist?” or “How does RST affect a connection?”
  • Relevance: Relates to socket cleanup in high-concurrency servers, relevant to your epoll interest.

4. Sequence and Acknowledgment Numbers

  • Sequence Numbers: Track the order of bytes in the data stream, starting with an ISN during handshake.
  • Acknowledgment Numbers: Indicate the next expected byte the receiver wants, confirming receipt of prior bytes.
  • Interview Focus: How they ensure ordered delivery, handling out-of-order packets, and calculating segment sizes.
  • Relevance: Critical for debugging network issues (e.g., with tcpdump), aligning with your April 5, 2025 troubleshooting interest.

5. Flow Control

  • Uses a sliding window to manage the amount of data sent before receiving ACKs, preventing receiver buffer overflow.
  • Receive Window (rwnd): Advertised by the receiver in TCP headers to indicate available buffer space.
  • Interview Focus: Window scaling option (for large bandwidth-delay products), zero-window scenarios, and performance tuning.

6. Congestion Control

  • Manages network congestion to avoid overwhelming the network.
  • Key algorithms:
    • Slow Start: Exponentially increases sending rate until a threshold (ssthresh).
    • Congestion Avoidance: Linearly increases rate after ssthresh.
    • Fast Retransmit: Retransmits lost packets after three duplicate ACKs.
    • Fast Recovery: Adjusts window size without restarting slow start.
  • Common implementations: TCP Reno, Cubic, BBR.
  • Interview Focus: Algorithm differences, handling packet loss, and tuning for high-latency networks (e.g., transcontinental links).
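
The slow start / congestion avoidance / fast recovery interplay above can be traced with a toy simulation. This is a deliberately simplified Reno-style model (cwnd in whole segments, one loss event per RTT, no timeout path); the function name and parameters are invented for the sketch.

```python
# Toy AIMD trace: slow start doubles cwnd each RTT until ssthresh,
# congestion avoidance then adds one segment per RTT; on loss,
# ssthresh = cwnd/2 and (Reno fast recovery) cwnd resumes from there.
# A retransmission timeout would instead reset cwnd to 1.
def aimd_trace(rtts: int, loss_at: set, ssthresh: int = 16) -> list:
    cwnd, trace = 1, []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in loss_at:
            ssthresh = max(cwnd // 2, 1)   # multiplicative decrease
            cwnd = ssthresh                # fast recovery, not back to 1
        elif cwnd < ssthresh:
            cwnd *= 2                      # slow start: exponential growth
        else:
            cwnd += 1                      # congestion avoidance: linear growth
    return trace

print(aimd_trace(10, loss_at={6}))
# [1, 2, 4, 8, 16, 17, 18, 9, 10, 11]
```

The sawtooth in the output (exponential ramp, linear climb, halving on loss) is the shape interviewers usually ask you to draw.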

7. TCP State Machine

  • Describes the states a TCP connection transitions through (e.g., LISTEN, SYN_SENT, ESTABLISHED, FIN_WAIT, TIME_WAIT).
  • Interview Focus: State transitions, purpose of each state, and handling edge cases (e.g., simultaneous open/close). Questions like “What states are involved in a handshake?” or “Why is TIME_WAIT 2*MSL (Maximum Segment Lifetime)?” are common.
  • Why is TIME_WAIT 2*MSL? MSL (Maximum Segment Lifetime) is the longest a segment can survive in the network; Linux assumes 30 seconds (roughly the time to traverse ~64 router hops). The closing side waits 2*MSL so that its final ACK can reach the peer and any retransmitted FIN the peer sends in response can still arrive, guaranteeing stray segments in both directions have expired before the same port pair can be reused.

8. TCP Header

  • Contains fields like:
    • Source/Destination Ports
    • Sequence/Acknowledgment Numbers
    • Flags (SYN, ACK, FIN, RST, PSH, URG)
    • Window Size
    • Checksum
    • Options (e.g., MSS, window scaling, timestamps)
  • Interview Focus: Header structure, role of flags, and options for performance (e.g., MSS negotiation). Expect questions like “What’s the maximum TCP header size?” or “How does window scaling work?”
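
The fixed 20-byte header layout above can be packed directly with `struct`, which also answers the "maximum header size" question: the 4-bit data offset field counts 32-bit words, so the header ranges from 5 words (20 bytes, no options) to 15 words (60 bytes). The helper name and field choices below are this example's own.

```python
import struct

# Pack the fixed portion of a TCP header (no options). Field order per
# the header layout: ports, sequence/ack numbers, data offset + flags,
# window, checksum, urgent pointer.
def tcp_header(src_port, dst_port, seq, ack, flags, window):
    data_offset = 5                        # header length in 32-bit words
    offset_flags = (data_offset << 12) | flags  # offset in the top 4 bits
    return struct.pack(
        "!HHIIHHHH",                       # network byte order, 20 bytes total
        src_port, dst_port,
        seq, ack,
        offset_flags, window,
        0,                                 # checksum (real stacks compute it
                                           # over a pseudo-header + payload)
        0,                                 # urgent pointer
    )

SYN = 0x02                                 # flag bits: FIN=0x01, SYN=0x02, RST=0x04,
                                           # PSH=0x08, ACK=0x10, URG=0x20
hdr = tcp_header(54321, 80, seq=1000, ack=0, flags=SYN, window=65535)
print(len(hdr))                            # 20
```

Unpacking the same bytes with `struct.unpack("!HHIIHHHH", hdr)` recovers each field, which is essentially what tools like tcpdump do when decoding captures.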

9. Retransmission and Timeout

  • TCP retransmits lost packets based on:
    • Retransmission Timeout (RTO): Calculated using RTT (Round-Trip Time) estimates.
    • Fast Retransmit: Triggered by three duplicate ACKs.
  • Interview Focus: RTO calculation, exponential backoff, and impact on latency. Questions like “How does TCP detect packet loss?” are common.
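
The RTO calculation can be made concrete with the standard estimator from RFC 6298: SRTT and RTTVAR are exponentially weighted averages of measured round-trip times (alpha = 1/8, beta = 1/4), and RTO = SRTT + 4*RTTVAR with a floor of 1 second. The function below is a sketch of that computation (clock granularity simplified to zero).

```python
# RTO estimation per RFC 6298. First sample initializes SRTT = R and
# RTTVAR = R/2; later samples are smoothed with beta=1/4 and alpha=1/8.
def rto_estimator(samples, k=4):
    srtt = rttvar = rto = None
    for r in samples:
        if srtt is None:
            srtt, rttvar = r, r / 2                     # first measurement
        else:
            rttvar = 0.75 * rttvar + 0.25 * abs(srtt - r)
            srtt = 0.875 * srtt + 0.125 * r
        rto = max(1.0, srtt + k * rttvar)               # 1-second floor per RFC
    return rto

# On a fast LAN the 1-second floor dominates:
print(rto_estimator([0.100, 0.120, 0.080]))             # 1.0
```

On each retransmission the RTO is doubled (exponential backoff), which is why a lossy link shows rapidly growing gaps between retransmits in a tcpdump capture.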

10. Performance Optimizations

  • Nagle’s Algorithm: Buffers small data chunks to reduce packet overhead, disabled for low-latency apps (e.g., via TCP_NODELAY).
  • Delayed ACK: Delays ACKs to piggyback with data, tunable for performance.
  • Window Scaling: Extends window size for high-bandwidth, high-latency networks.
  • Selective ACK (SACK): Allows retransmission of specific lost segments.
  • Interview Focus: Trade-offs (e.g., latency vs. throughput), socket options, and tuning for web servers.
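
Disabling Nagle's algorithm is a one-line socket option, shown below with the stdlib (on Linux and macOS the option defaults to off, i.e., Nagle enabled):

```python
import socket

# Nagle's algorithm batches small writes by default; latency-sensitive
# applications (trading, games, RPC) disable it per-socket with
# TCP_NODELAY so each small write is sent immediately.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
before = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)   # 0: Nagle on
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
after = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(before, after != 0)    # 0 True
s.close()
```

The trade-off interviewers probe: TCP_NODELAY lowers latency for small messages but can increase packet count (more small segments), so it is wrong for bulk throughput workloads.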

11. Socket Programming Integration

  • TCP is implemented via socket APIs in OS (e.g., socket(), bind(), listen(), accept(), connect()).
  • Key OS interactions:
    • Non-blocking I/O: Using epoll/select for scalability (from your April 18, 2025 OS terms request).
    • Buffer Management: Kernel socket buffers (SO_SNDBUF, SO_RCVBUF) affect performance.
  • Interview Focus: Coding a TCP server/client, handling multiple connections, and debugging socket errors (e.g., ECONNREFUSED).
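
A minimal version of the classic "write a TCP server/client" task, exercising the APIs named in this section (`socket()`, `bind()`, `listen()`, `accept()`, `connect()`) and peeking at the kernel socket buffer via `getsockopt`:

```python
import socket
import threading

# Minimal blocking TCP echo pair: server does socket/bind/listen/accept,
# client connects and sends one message, server echoes it back.
def serve_one(srv):
    conn, _ = srv.accept()
    with conn:
        data = conn.recv(4096)
        conn.sendall(data)               # echo the bytes back

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))               # port 0: kernel picks a free port
srv.listen(1)
t = threading.Thread(target=serve_one, args=(srv,))
t.start()

cli = socket.create_connection(srv.getsockname())
# The kernel socket buffers mentioned above are visible via getsockopt:
rcvbuf = cli.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
cli.sendall(b"ping")
reply = cli.recv(4096)
print(reply, rcvbuf > 0)                 # b'ping' True
cli.close()
t.join()
srv.close()
```

A thread-per-connection server like this is fine for the interview whiteboard; the follow-up ("handle 10,000 connections") is where the epoll/selectors design from the non-blocking I/O discussion comes in.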

12. TCP vs. Alternatives

  • Compared to:
    • UDP: Faster but unreliable, used for DNS, streaming.
    • QUIC: UDP-based, low-latency protocol (HTTP/3) with TCP-like reliability.
    • RDMA: Hardware-based, bypasses TCP stack for ultra-low latency (your April 2025 RDMA focus).
  • Interview Focus: When to use TCP vs. alternatives, and TCP’s overhead (e.g., handshake, congestion control).

13. Common Issues and Debugging

  • Issues:

    • Connection timeouts (e.g., firewall blocking SYN).
    • High latency due to congestion or retransmissions.
    • Resource exhaustion (e.g., too many open sockets in TIME_WAIT).
  • Tools: tcpdump, wireshark, netstat, ss for packet analysis and socket state.

  • Interview Focus: Diagnosing issues like “Why is a connection stuck in SYN_SENT?” or “How do you reduce TIME_WAIT sockets?”

  • Linux Focus: TCP is deeply integrated with Linux networking (e.g., socket APIs, epoll, io_uring), critical for internet companies. Your OS questions (e.g., fork(), epoll) align with TCP’s OS dependencies.

  • Coding Questions: Expect tasks like “Write a TCP echo server” or “Handle 10,000 concurrent connections,” leveraging epoll or io_uring (from your April 18, 2025 OS terms).

  • System Design: TCP knowledge is tested in designing scalable systems (e.g., “How does TCP impact web server performance?” or “Optimize a proxy for high throughput”).

  • RDMA Contrast: Your RDMA questions (April 2025) suggest interest in high-performance networking; TCP’s software-based reliability contrasts with RDMA’s hardware approach, a common interview comparison.

  • Practical Scenarios: Be ready to debug TCP issues (e.g., using tcpdump to analyze retransmissions) or tune performance (e.g., adjusting TCP_NODELAY), tying to your troubleshooting focus.

Spring Boot

Below is a list of key terms and concepts related to Spring Boot backend development, tailored for internet company interviews, particularly for roles involving Java-based backend systems. Given your prior questions about operating systems (e.g., fork(), mmap, epoll) and networking (e.g., TCP, RDMA), I’ve emphasized Spring Boot terms that intersect with these areas where relevant, such as REST APIs, performance optimization, and integration with networked systems. These terms are frequently discussed in interviews at tech companies (e.g., for building scalable microservices or APIs) and cover core Spring Boot features, architecture, and ecosystem components.

Key Terms in Spring Boot Backend Development

1. Core Spring Boot Concepts

  1. Spring Boot:

    • A framework built on top of Spring to simplify Java backend development with auto-configuration, embedded servers, and minimal boilerplate.
    • Interview Focus: Auto-configuration, starters, and differences from Spring MVC.
  2. Starters:

    • Pre-configured dependency sets (e.g., spring-boot-starter-web, spring-boot-starter-data-jpa) to quickly include libraries.
    • Interview Focus: Common starters, dependency management, and customizing starter configurations.
  3. Auto-Configuration:

    • Spring Boot’s mechanism to automatically configure beans based on classpath dependencies and properties.
    • Interview Focus: How it works (e.g., @Conditional annotations), disabling auto-config, and debugging configuration issues.
  4. Application Properties/YAML:

    • Configuration files (application.properties or application.yml) for customizing Spring Boot settings (e.g., server port, database URL).
    • Interview Focus: Profiles (e.g., application-dev.yml), externalized configuration, and property precedence.
  5. Embedded Server:

    • Built-in servers (e.g., Tomcat, Jetty, Undertow) for running Spring Boot apps without external deployment.
    • Interview Focus: Configuring server properties, switching servers, and performance tuning (e.g., thread pools).
    • Relevance: Ties to networking (e.g., TCP, epoll from your April 18, 2025 questions) for handling HTTP connections.

2. Web Development and APIs

  1. REST API:

    • Representational State Transfer APIs built with Spring Boot’s @RestController for stateless, HTTP-based communication.
    • Interview Focus: REST principles, HTTP methods (GET, POST, PUT, DELETE), and status codes (e.g., 200, 404).
    • Relevance: Relates to HTTP/HTTPS and TCP (from your April 18, 2025 networking and TCP questions) for client-server communication.
  2. @RestController:

    • Annotation combining @Controller and @ResponseBody to handle HTTP requests and return JSON/XML.
    • Interview Focus: Request mapping (@GetMapping, @PostMapping), path variables, and request body validation.
  3. Spring MVC:

    • The web framework underlying Spring Boot’s web layer, handling HTTP requests and responses.
    • Interview Focus: MVC pattern (Model, View, Controller), request lifecycle, and dispatcher servlet.
  4. OpenAPI/Swagger:

    • Tools for documenting and testing REST APIs, integrated via springdoc-openapi or springfox.
    • Interview Focus: Generating API docs, annotating endpoints, and client generation.
  5. HTTP Client (RestTemplate/WebClient):

    • RestTemplate: Synchronous HTTP client for calling external APIs.
    • WebClient: Reactive, non-blocking client for modern apps.
    • Interview Focus: Synchronous vs. reactive clients, handling timeouts, and retries.
    • Relevance: Ties to TCP and non-blocking I/O (e.g., epoll from your OS terms) for scalable API calls.

3. Data Access and Persistence

  1. Spring Data JPA:

    • Simplifies database access with JPA (Java Persistence API) and Hibernate, using repositories for CRUD operations.
    • Interview Focus: @Entity, @Repository, query methods, and JPQL/custom queries.
  2. Hibernate:

    • ORM (Object-Relational Mapping) framework used by Spring Data JPA to map Java objects to database tables.
    • Interview Focus: Lazy vs. eager loading, N+1 query problem, and caching (e.g., second-level cache).
  3. Database Transactions:

    • Managed via @Transactional to ensure data consistency (ACID properties).
    • Interview Focus: Transaction propagation (e.g., REQUIRED, REQUIRES_NEW), isolation levels, and rollback handling.
  4. Spring Data REST:

    • Automatically exposes JPA repositories as REST APIs.
    • Interview Focus: HATEOAS (Hypermedia as the Engine of Application State), customizing endpoints, and security.

4. Microservices and Distributed Systems

  1. Spring Cloud:

    • Suite of tools for building microservices (e.g., service discovery, config server, circuit breakers).
    • Interview Focus: Eureka (service discovery), Spring Cloud Config, and load balancing.
    • Relevance: Relates to networking concepts like load balancing and VPC (from your April 18, 2025 networking terms).
  2. Circuit Breaker:

    • Pattern (e.g., via Resilience4j or Hystrix) to handle failures in microservices by preventing cascading failures.
    • Interview Focus: Configuration, fallback methods, and monitoring.
  3. Service Discovery:

    • Mechanism (e.g., Eureka, Consul) for microservices to dynamically locate each other.
    • Interview Focus: Client-side vs. server-side discovery, health checks, and DNS integration.
  4. API Gateway:

    • Entry point for microservices (e.g., Spring Cloud Gateway) handling routing, authentication, and rate limiting.
    • Interview Focus: Filters, predicates, and integration with load balancers.
    • Relevance: Ties to networking (e.g., load balancing, TCP) for traffic management.

5. Security

  1. Spring Security:

    • Framework for securing Spring Boot apps with authentication and authorization.
    • Interview Focus: OAuth2, JWT, @PreAuthorize, and securing REST endpoints.
    • Relevance: Relates to TLS/SSL (from your networking terms) for secure communication.
  2. OAuth2:

    • Protocol for token-based authentication, integrated with Spring Security.
    • Interview Focus: Authorization code flow, refresh tokens, and resource server setup.

6. Performance and Scalability

  1. Actuator:

    • Endpoints for monitoring and managing Spring Boot apps (e.g., /actuator/health, /actuator/metrics).
    • Interview Focus: Custom endpoints, integration with Prometheus/Grafana, and health checks.
  2. Caching:

    • Improves performance by storing frequently accessed data (e.g., via @Cacheable, Ehcache, Redis).
    • Interview Focus: Cache eviction, cache providers, and distributed caching.
    • Relevance: Ties to OS memory management (e.g., page cache, mmap from your April 18, 2025 OS terms).
  3. Asynchronous Processing:

    • Uses @Async and TaskExecutor for non-blocking tasks (e.g., sending emails).
    • Interview Focus: Thread pools, @EnableAsync, and handling async failures.
    • Relevance: Relates to OS threading and synchronization (e.g., std::condition_variable from your April 2, 2025 question).
  4. Reactive Programming (Spring WebFlux):

    • Non-blocking, event-driven model using Project Reactor for high-concurrency apps.
    • Interview Focus: Mono/Flux, reactive repositories, and WebFlux vs. MVC.
    • Relevance: Aligns with non-blocking I/O like epoll or io_uring (from your OS terms).

7. Testing

  1. Spring Boot Test:

    • Framework for unit and integration testing with @SpringBootTest, @WebMvcTest, and @DataJpaTest.
    • Interview Focus: Mocking (Mockito), testing REST APIs, and database testing.
  2. MockMvc:

    • Utility for testing Spring MVC controllers without starting a server.
    • Interview Focus: Simulating HTTP requests, verifying responses, and chaining assertions.

8. Deployment and DevOps

  1. Spring Boot Maven/Gradle Plugin:

    • Tools for building and packaging Spring Boot apps (e.g., spring-boot:run, executable JARs).
    • Interview Focus: Dependency management, fat JARs, and multi-module projects.
  2. Docker Integration:

    • Containerizing Spring Boot apps for deployment in Kubernetes or cloud platforms.
    • Interview Focus: Dockerfile setup, multi-stage builds, and environment variables.
    • Relevance: Relates to OS concepts like cgroups and namespaces (from your April 18, 2025 OS terms).
  3. Cloud-Native:

    • Running Spring Boot apps on platforms like AWS, GCP, or Azure with Spring Cloud.
    • Interview Focus: 12-factor app principles, auto-scaling, and service meshes.

9. Messaging and Event-Driven Systems

  1. Spring Kafka:

    • Integrates Apache Kafka for event-driven microservices.
    • Interview Focus: Producers, consumers, partitions, and error handling.
  2. Spring AMQP:

    • Supports RabbitMQ for message queues.
    • Interview Focus: Queues, exchanges, and message retry mechanisms.

Interview-Relevant Notes

  • Java and Linux Focus: Spring Boot runs on Java, often on Linux servers, so familiarity with OS concepts (e.g., epoll for HTTP handling, mmap for caching) and networking (e.g., TCP for REST APIs) is a plus. Your prior OS and networking questions align well.
  • REST and Microservices: Expect coding questions like “Build a REST API with Spring Boot” or system design tasks like “Design a microservices architecture with Spring Cloud.” Your TCP knowledge (April 18, 2025) supports REST API development.
  • Performance and Scalability: Terms like WebFlux, caching, and Actuator are critical for roles at companies like Amazon, ByteDance, or Netflix, where scalability is key. Your RDMA/zero-copy interests (April 2025) suggest a focus on performance.
  • Practical Scenarios: Be ready to debug issues (e.g., “Why is my API slow?” using Actuator) or secure endpoints (e.g., with Spring Security). Your troubleshooting interest (April 5, 2025) aligns with this.
  • OS-Networking Overlap: Spring Boot’s embedded servers use TCP sockets and OS I/O (e.g., epoll), and caching ties to OS memory management, leveraging your OS knowledge.

Tips for Preparation

  • Coding Practice: Build a REST API with Spring Boot, integrating JPA, Security, and Actuator. Practice with starters like spring-boot-starter-web and spring-boot-starter-data-jpa.
  • System Design: Study microservices patterns (e.g., circuit breakers, API gateways) and design a system using Spring Cloud components.
  • Testing: Write tests with @SpringBootTest and MockMvc to simulate API calls, preparing for coding interviews.
  • Performance Tuning: Experiment with @Async, WebFlux, and caching (e.g., Redis) to optimize APIs, aligning with your RDMA/zero-copy focus.
  • Networking Integration: Understand how Spring Boot handles HTTP over TCP (e.g., connection pooling, timeouts) and secure it with TLS, leveraging your TCP knowledge.

If you want a deeper dive into any term (e.g., building a REST API, configuring Spring Security, or integrating Kafka), practice interview questions, or connections to OS/networking concepts, let me know!

Key Terms in Docker

Docker vs. Virtual Machines

  • A virtual machine runs a complete operating system on (virtualized) hardware; it is an independent OS with its own networking, process management, and so on.
  • A Docker container runs on the host operating system and shares the host kernel. Even an Ubuntu container only supplies a Linux root filesystem and userland software; core functionality such as the filesystem, process management, and networking is provided by the host, and the kernel in use is the host's.

1. Core Docker Concepts

  1. Container:

    • A lightweight, isolated environment running an application and its dependencies, created from a Docker image.
    • Interview Focus: Containers vs. VMs, isolation mechanisms (namespaces, cgroups), and use cases (e.g., microservices).
    • Relevance: Your March 28, 2025 question about installing Rust in a Docker container shows familiarity with container usage.
  2. Docker Image:

    • A read-only template (e.g., nginx:latest) used to create containers, containing application code, libraries, and dependencies.
    • Interview Focus: Image layers, Dockerfile, and optimizing image size (e.g., multi-stage builds).
  3. Dockerfile:

    • A script with instructions (e.g., FROM, RUN, COPY) to build a Docker image.
    • Interview Focus: Best practices (e.g., minimizing layers, using .dockerignore), and multi-stage builds for Spring Boot apps (from your April 18, 2025 Spring Boot terms).
    • Relevance: Your Rust Docker question (March 28, 2025) involved Dockerfile setup.
  4. Docker Registry:

    • A repository for storing and distributing Docker images (e.g., Docker Hub, AWS ECR).
    • Interview Focus: Pushing/pulling images, private registries, and authentication.
  5. Docker Engine:

    • The runtime that builds and runs containers, including the Docker daemon (dockerd) and CLI (docker).
    • Interview Focus: Daemon architecture, client-server communication, and troubleshooting (e.g., docker logs).

2. Container Management

  1. Docker Compose:

    • A tool for defining and running multi-container applications using YAML files (e.g., docker-compose.yml).
    • Interview Focus: Services, networks, volumes, and scaling multi-container apps (e.g., Spring Boot + database).
    • Relevance: Complements your Spring Boot microservices interest (April 18, 2025).
  2. Volumes:

    • Persistent storage for containers, mounted to preserve data across container restarts.
    • Interview Focus: Volumes vs. bind mounts, managing data (e.g., for databases), and backup strategies.
    • Relevance: Ties to your March 24, 2025 question about disk space management on Linux servers.
  3. Docker Network:

    • Defines how containers communicate (e.g., bridge, host, overlay networks).
    • Interview Focus: Network modes, port mapping, and container-to-container communication.
    • Relevance: Relates to your TCP and networking knowledge (April 18, 2025), as containers use TCP sockets.

3. Runtime and Isolation

  1. Namespaces:

    • Linux kernel feature isolating resources (e.g., PID, network, mount) for container isolation.
    • Interview Focus: Types of namespaces, how they enable isolation, and debugging namespace issues.
    • Relevance: Discussed in your April 18, 2025 OS terms request, critical for Docker’s architecture.
  2. Cgroups (Control Groups):

    • Linux mechanism to limit resources (e.g., CPU, memory) for containers.
    • Interview Focus: Resource allocation, configuring limits, and performance tuning.
    • Relevance: Also from your April 18, 2025 OS terms, relevant to container resource management.
  3. Containerd:

    • A container runtime used by Docker to manage container lifecycles, abstracted from the Docker daemon.
    • Interview Focus: Containerd vs. Docker runtime, CRI (Container Runtime Interface), and its role in K8s.

4. Security and Optimization

  1. Docker Security:

    • Practices like running containers as non-root, using USER in Dockerfile, and scanning images for vulnerabilities.
    • Interview Focus: Securing images, AppArmor/SELinux, and mitigating privilege escalation.
    • Relevance: Aligns with your Spring Boot security terms (e.g., Spring Security, TLS) from April 18, 2025.
  2. Multi-Stage Builds:

    • A Dockerfile technique to reduce image size by separating build and runtime environments.
    • Interview Focus: Writing multi-stage Dockerfiles for Java/Spring Boot apps, optimizing CI/CD pipelines.

Key Terms in Kubernetes (K8s)

1. Core K8s Components

  1. Pod:

    • The smallest deployable unit in K8s, containing one or more containers sharing network and storage.
    • Interview Focus: Pod lifecycle, multi-container pods, and init containers.
    • Relevance: Pods often run Docker containers, tying to your Docker knowledge.
  2. Node:

    • A worker machine (physical or virtual) in a K8s cluster, running pods.
    • Interview Focus: Node components (kubelet, kube-proxy), taints/tolerations, and node affinity.
  3. Cluster:

    • A set of nodes (control plane + worker nodes) managed by K8s to run containerized workloads.
    • Interview Focus: Control plane components (API server, etcd, scheduler, controller manager), and high availability.
  4. Kubelet:

    • Agent running on each node, communicating with the control plane to manage pods.
    • Interview Focus: Pod lifecycle management, health checks, and troubleshooting kubelet failures.
  5. Kube-Proxy:

    • Runs on each node to manage network rules for pod communication (e.g., via iptables, IPVS).
    • Interview Focus: Service discovery, load balancing, and network policies.
    • Relevance: Relates to your networking terms (e.g., load balancing, TCP) from April 18, 2025.

2. Workload Management

  1. Deployment:

    • A K8s resource to manage stateless applications, ensuring desired pod replicas and rolling updates.
    • Interview Focus: Rolling vs. blue-green deployments, rollback strategies, and scaling.
    • Relevance: Used for Spring Boot microservices (from your April 18, 2025 Spring Boot terms).
  2. StatefulSet:

    • Manages stateful applications (e.g., databases) with stable pod identities and persistent storage.
    • Interview Focus: Differences from Deployment, use cases (e.g., MySQL, MongoDB), and PVCs.
  3. DaemonSet:

    • Ensures a pod runs on every node (e.g., for logging, monitoring agents).
    • Interview Focus: Use cases (e.g., Fluentd, Prometheus), and node selectors.
  4. Job/CronJob:

    • Job: Runs a task to completion (e.g., batch processing).
    • CronJob: Schedules Jobs to run periodically.
    • Interview Focus: Configuring parallelism, retries, and scheduling syntax.

3. Networking and Service Discovery

  1. Service:

    • An abstraction to expose pods via a stable IP or DNS name (e.g., ClusterIP, NodePort, LoadBalancer).
    • Interview Focus: Service types, DNS resolution, and headless services.
    • Relevance: Ties to your TCP and DNS knowledge (April 18, 2025) for pod communication.
  2. Ingress:

    • A K8s resource to manage external HTTP/HTTPS traffic, routing to services based on rules.
    • Interview Focus: Ingress controllers (e.g., NGINX, Traefik), path-based routing, and TLS.
    • Relevance: Aligns with your Spring Boot REST APIs and TLS (April 18, 2025).
  3. Network Policy:

    • Defines rules for pod-to-pod communication, enforcing security via ingress/egress policies.
    • Interview Focus: Writing policies, Calico/Cilium integration, and debugging connectivity.
    • Relevance: Relates to your firewall and NAT networking terms (April 18, 2025).

4. Storage and Configuration

  1. Persistent Volume (PV)/Persistent Volume Claim (PVC):

    • PV: Cluster-wide storage resource (e.g., NFS, EBS).
    • PVC: Request for storage by a pod, bound to a PV.
    • Interview Focus: Storage classes, dynamic provisioning, and stateful apps.
    • Relevance: Ties to your disk space management questions (March 24, 2025).
  2. ConfigMap:

    • Stores configuration data (e.g., key-value pairs) for pods, mounted as volumes or environment variables.
    • Interview Focus: Creating ConfigMaps, updating configs, and hot reloading.
    • Relevance: Similar to Spring Boot’s application.properties (April 18, 2025).
  3. Secret:

    • Stores sensitive data (e.g., passwords, API keys), base64-encoded (encoded, not encrypted), used by pods.
    • Interview Focus: Creating Secrets, securing access, and integrating with Spring Security.

5. Observability and Scaling

  1. Horizontal Pod Autoscaler (HPA):

    • Scales pod replicas based on metrics (e.g., CPU, memory, custom metrics).
    • Interview Focus: Configuring HPA, metrics server, and custom metrics with Prometheus.
    • Relevance: Aligns with Spring Boot Actuator for monitoring (April 18, 2025).
  2. Prometheus/Grafana:

    • Monitoring tools integrated with K8s for metrics collection and visualization.
    • Interview Focus: Setting up Prometheus, writing queries, and dashboarding with Grafana.
  3. Liveness/Readiness Probes:

    • Health checks to determine if a pod is alive (running) or ready to serve traffic.
    • Interview Focus: Configuring probes, HTTP vs. command probes, and failure handling.
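The HPA item above uses a scaling rule worth knowing by heart: `desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)`, clamped to the configured replica range. A sketch of that calculation (the clamping parameters mirror `minReplicas`/`maxReplicas`):

```python
import math

def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float,
                         min_replicas: int = 1,
                         max_replicas: int = 10) -> int:
    """Core HPA scaling rule:
    desired = ceil(current * currentMetric / targetMetric),
    clamped to [min_replicas, max_replicas]."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# 4 pods averaging 80% CPU against a 50% target -> ceil(4 * 80 / 50) = 7
assert hpa_desired_replicas(4, 80, 50) == 7
# load drops to 20% -> scale down to ceil(7 * 20 / 50) = 3
assert hpa_desired_replicas(7, 20, 50) == 3
```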

Interview-Relevant Notes

  • Linux and Networking Overlap: Docker and K8s rely on Linux features like namespaces, cgroups, and networking (e.g., TCP, DNS), aligning with your OS (April 18, 2025) and networking (TCP, RDMA) questions. Your Linux disk management queries (March 24, 2025) relate to volumes/PVs.
  • Spring Boot Integration: Docker and K8s are used to deploy Spring Boot microservices (from your April 18, 2025 Spring Boot terms), with Dockerfiles for containerization and K8s Deployments for scaling.
  • Practical Scenarios: Expect tasks like “Write a Dockerfile for a Spring Boot app,” “Design a K8s Deployment with Ingress,” or “Debug a pod networking issue.” Your troubleshooting interest (April 5, 2025, deadlocks) supports debugging K8s issues.
  • RDMA Context: Your RDMA questions (April 8, 11, 2025) suggest interest in high-performance systems; while K8s doesn’t directly use RDMA, container networking (e.g., CNI plugins like Calico) and storage (e.g., NVMe over RDMA) may relate.
  • Microservices and Scalability: Terms like Ingress, Service, and HPA are critical for roles at companies like Google, AWS, or ByteDance, where cloud-native systems are key, complementing your Spring Boot microservices knowledge.

Tips for Preparation

  • Docker Practice: Write a Dockerfile for a Spring Boot app, use Docker Compose for multi-container setups (e.g., app + database), and optimize with multi-stage builds, leveraging your Rust Dockerfile experience (March 28, 2025).
  • K8s Practice: Deploy a Spring Boot app using kubectl, configure Ingress, and set up HPA with Prometheus, building on your microservices interest.
  • Networking and OS: Study container networking (e.g., bridge vs. overlay) and K8s CNI plugins, using your TCP and epoll knowledge (April 18, 2025) to understand pod communication.
  • Debugging: Learn docker logs, kubectl describe, and kubectl logs for troubleshooting, aligning with your April 5, 2025 troubleshooting skills.
  • System Design: Prepare to design a microservices architecture with Docker and K8s, integrating Spring Boot, Ingress, and monitoring, tying to your REST API and cloud-native interests.


Design Pattern

Design Pattern: A reusable solution to a common problem in software design; applying patterns makes code more reusable and maintainable.

Singleton: Ensures a class has only one instance and provides a global access point to it.
Factory: Creates objects through a factory class instead of calling constructors directly, usually via a static creation method.
Observer: Defines a one-to-many dependency; when the subject’s state changes, it notifies all registered observers.
Strategy: Defines a family of algorithms, encapsulates each one in its own class, and makes them interchangeable at runtime.
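Minimal sketches of the four patterns above, in Python (class and function names are illustrative):

```python
# Singleton: one instance, global access point.
class Config:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

assert Config() is Config()

# Factory: create objects via a creation method, not direct constructors.
class Circle:
    name = "circle"

class Square:
    name = "square"

class ShapeFactory:
    @staticmethod
    def create(kind):
        return {"circle": Circle, "square": Square}[kind]()

assert ShapeFactory.create("circle").name == "circle"

# Observer: one-to-many dependency; the subject notifies all observers.
class Subject:
    def __init__(self):
        self._observers = []

    def attach(self, callback):
        self._observers.append(callback)

    def set_state(self, state):
        for notify in self._observers:   # push the change to every observer
            notify(state)

events = []
subject = Subject()
subject.attach(events.append)
subject.set_state("changed")
assert events == ["changed"]

# Strategy: interchangeable algorithms selected at runtime.
class Sorter:
    def __init__(self, strategy):
        self._strategy = strategy        # the algorithm can be swapped dynamically

    def sort(self, data):
        return self._strategy(data)

assert Sorter(sorted).sort([3, 1, 2]) == [1, 2, 3]
assert Sorter(lambda d: sorted(d, reverse=True)).sort([3, 1, 2]) == [3, 2, 1]
```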