knowledge

Programming language, algorithm, data structure
    Encapsulation
    difference between static variable and global variable
    string
    size of a pointer
    differences between #define and enum
    difference between assignment and initialization
    Differences between inline functions and #define
    variable scope
    Tree Traversals
    Tree
    Can you use memset() to set all data members in a class to 0?
    A program starts to run but crashes before it gets into main(). Why?
    Debug
    register in C
    .bss (about OSN)
    memory leak
    memory overflow
    stack
    heap
    binary search tree
    grammar
    how is a source file compiled to an executable?
    Given class A, how to repeat all functions of A in class B without using extends
    Evaluation strategy
    important features of OOP? Why is OOP better?
    enum
    struct and union
    Hash table
    inline function
    Dynamic programming language
    Reflection
    Strongly vs. weakly typed programming languages
    Static typing vs. dynamic typing
    Scripting language
    Optimal substructure
    Dynamic programming
    Greedy algorithm
    Random numbers
    XOR
Architecture, system
    How can the main thread know that a worker thread is finished?
    multicast, broadcast, unicast
    differences between thread and process
    Advantage of threads over processes
    Thread synchronization
    multi-thread
    difference between multithreading and multitasking
    A 4-byte datum is supposed to be 1; however, after casting it to int, the value is not correct. Why?
    Following the last question, if int is little-endian, what is the big-endian value?
    Binary semaphore vs. mutex
    Mutex vs. semaphore
    Real-time system
    cache memory
    Safe state and its use in deadlock avoidance
    hard disk and its purpose
    Fragmentation
    DRAM, and in which form it stores data
    Dispatcher
    CPU Scheduler
    Context Switch
    Basic functions of an operating system
    Why is paging used?
    What resources are used when a thread is created? How do they differ from those used when a process is created?
    Throughput, turnaround time, waiting time, and response time
    Thrashing
    multitasking, multiprogramming, multithreading
    virtual memory
    A program has a File Input phase and a Computing phase; Input takes too long. How can it be improved?
    compare two objects
    Multiprocessing, multi-core processors, parallel computing, concurrency, and distributed computing
    embedded-system-related concepts
    synchronization and asynchronization
    Memory organization and layout
Network
    TCP vs. UDP
    Why use UDP
    Procedure of the TCP handshake; format of a TCP header
Linux/unix
    difference between find and grep
    Question 1
Unsolved question




Programming language, algorithm, data structure

Encapsulation

Encapsulation separates the contractual interface of an abstraction from its implementation. The internal mechanisms of a
component can be improved without impact on other components that support the same public interface. Encapsulation protects
the integrity of the component by preventing users from setting the internal data of the component into an invalid or inconsistent
state; it reduces system complexity, and thus increases robustness, by limiting interdependencies between software components.
For example, a relational database is encapsulated in the sense that its only public interface is a query language (SQL for
example), which hides all the internal machinery and data structures of the database management system.
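
As a minimal sketch (the class and member names here are illustrative, not from the source), a C++ class can expose a
validated public interface while keeping its state private:

#include <stdexcept>

class BankAccount {
public:
    // the public interface is the only way to touch the balance
    void deposit(long cents) {
        if (cents < 0) throw std::invalid_argument("negative deposit");
        balance_ += cents;
    }
    long balance() const { return balance_; }
private:
    long balance_ = 0; // internal state cannot be driven into an invalid value directly
};
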
difference between static variable and global variable

A static variable is allocated statically: its lifetime extends across the entire run of the program. This is in contrast to automatic
variables, whose storage is allocated and deallocated on the call stack, and in contrast to objects whose storage is dynamically
allocated. In C and its descendants, the term static variable has at least two meanings, each related to the semantics of C's static
keyword:
* static local variables, which are scoped normally, but have static storage duration
* static global variables, which have the usual static storage duration, but are scoped to the file in which they are defined (as
opposed to external variables declared with the extern keyword).

Static variables are stored in the data segment of memory.

A global variable is accessible in every scope. Global variables are usually considered bad practice because of their nonlocality.
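
A small sketch of the two meanings of static in C/C++ (names are illustrative):

#include <cstdio>

static int file_scoped = 0; // "static global": static storage, visible only in this file

int counter() {
    static int calls = 0;   // static local: normal block scope, but static storage duration
    return ++calls;
}

int main() {
    std::printf("%d\n", counter());    // 1
    std::printf("%d\n", counter());    // 2 -- the static local survived between calls
    std::printf("%d\n", file_scoped);  // 0
    return 0;
}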


string

A string is generally understood as a data type storing a sequence of data values, in which the elements usually stand for
characters according to a character encoding; this differentiates it from the more general array data type.


size of a pointer

http://bytes.com/topic/c/answers/216087-size-sizeof-pointer A pointer is a variable that holds an address, so the size of a pointer
is the size of an address. For instance, on a machine with a 32-bit address space (4 GB of addressable memory), the size of a
pointer is 32 bits, or 4 bytes; on a 64-bit machine it is typically 8 bytes.
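
A quick, platform-dependent check (the printed sizes depend on the machine's address width):

#include <cstdio>

int main() {
    std::printf("sizeof(int*)  = %zu\n", sizeof(int*));  // 4 on a 32-bit system, 8 on a 64-bit system
    std::printf("sizeof(char*) = %zu\n", sizeof(char*)); // the same: both hold an address
    return 0;
}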


differences between #define and enum

http://www.geekinterview.com/talk/13189-diiference-between-enum-and-define.html: At the present time, there is little
difference: the ANSI standard says that enumerations may be freely intermixed with integral types, without errors.

Some advantages of enums are that the numeric values are automatically assigned, that a debugger may be able to display the
symbolic values when enum variables are examined, and that they obey block scope. A disadvantage is that the programmer has
little control over the size (an enum is based on some implementation-defined integral type, with the only caveat that it shall
not be larger than sizeof(int) unless the enumerators require a larger range than int can support).

http://bytes.com/topic/c/answers/522475-difference-between-enum-define-preprocessor: enum respects scope, #define doesn't.
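
A short sketch of the scoping difference (names are illustrative):

#define RED 0 // visible everywhere after this line; the preprocessor knows no scopes

namespace traffic {
    enum Color { Green = 1, Yellow, Amber = Yellow }; // values auto-assigned; obeys scope
}

int main() {
    int c = traffic::Yellow; // must be qualified: the enum respects the enclosing scope
    int r = RED;             // bare textual substitution, no scoping
    return c - 2 + r;        // 0
}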


difference between assignment and initialization

Assignment can be done many times, whereas initialization can be done only once.

http://www.informit.com/articles/article.aspx?p=376876: This difference is not obvious for built-in types such as int or
double. However, things can be quite different for user-defined types. Consider the following simple, nonstandard string class:

class String {
public:
    String( const char *init ); // intentionally not explicit!
    ~String();
    String( const String &that );
    String &operator =( const String &that );
    String &operator =( const char *str );
    void swap( String &that );
    friend const String operator +( const String &, const String & ); // concatenate
    friend bool operator <( const String &, const String & );
    //...
private:
    String( const char *, const char * ); // computational
    char *s_;
};

Initializing a String object with a character string is straightforward. We allocate a buffer big enough to hold a copy of the
character string and then copy.

String::String( const char *init ) {
     if( !init ) init = "";
     s_ = new char[ strlen(init)+1 ];
     strcpy( s_, init );
}

The destructor does the obvious thing: String::~String() { delete [] s_; }

Assignment is a somewhat more difficult job than construction:

String &String::operator =( const char *str ) {
     if( !str ) str = "";
     char *tmp = strcpy( new char[ strlen(str)+1 ], str );
     delete [] s_;
     s_ = tmp;
     return *this;
}

An assignment is somewhat like destruction followed by a construction. For a user-defined type, the target (left side, or this)
must be cleaned up before it is reinitialized with the source. In the case of our String type, the String's existing character buffer
must be freed before a new character buffer is attached. Because a proper assignment operation cleans up its left argument, one
should never perform a user-defined assignment on uninitialized storage:

String *names = static_cast<String *>(::operator new( BUFSIZ ));
names[0] = "Sakamoto"; // oops! delete [] uninitialized pointer!

In this case, names refers to uninitialized storage because we called operator new directly, avoiding implicit initialization by
String's default constructor; names refers to a hunk of memory filled with random bits. When the String assignment operator is
called in the second line, it will attempt to perform an array delete on an uninitialized pointer.

Because a constructor has less work to do than an assignment operator, an implementation will sometimes employ what's
known as a "computational constructor" for efficiency:
const String operator +( const String &a, const String &b )     { return String( a.s_, b.s_ ); }

The two-argument computational constructor is not intended to be part of the interface of the String class, so it's declared to be
private.

String::String( const char *a, const char *b ) {
     s_ = new char[ strlen(a)+strlen(b)+1 ];
     strcat( strcpy( s_, a ), b );
}


Differences between inline functions and #define

http://www.codeguru.com/forum/showthread.php?t=328273: Inline functions are similar to macros because they both are
expanded at compile time, but the macros are expanded by the preprocessor, while inline functions are parsed by the compiler.
There are several important differences:
* Inline functions follow all the protocols of type safety enforced on normal functions.
* Inline functions are specified using the same syntax as any other function except that they include the inline keyword in the
function declaration.
* Expressions passed as arguments to inline functions are evaluated once. In some cases, expressions passed as arguments to
macros can be evaluated more than once.
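
A minimal sketch of the double-evaluation pitfall (max is used here as a stand-in for any real macro):

#include <cstdio>

#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

inline int max_fn(int a, int b) { return a > b ? a : b; } // each argument evaluated exactly once

int next(int &i) { return ++i; } // a call with a visible side effect

int main() {
    int i = 0, j = 0;
    int m = MAX_MACRO(next(i), 0); // expands to ((next(i)) > (0) ? (next(i)) : (0)):
                                   // next(i) runs twice, so m == 2 and i == 2
    int f = max_fn(next(j), 0);    // next(j) runs once: f == 1, j == 1
    std::printf("m=%d i=%d f=%d j=%d\n", m, i, f, j);
    return 0;
}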


variable scope

Scope is an enclosing context where values and expressions are associated. Typically, scope is used to define the visibility and
reach of information hiding. Scopes can contain declarations or definitions of identifiers, can contain statements and/or
expressions which define an executable algorithm or part thereof, and can nest or be nested.
Static scoping (lexical scoping): a variable always refers to its top-level environment. This is a property of the program text and
unrelated to the runtime call stack. Static scope is standard in modern functional languages such as ML and Haskell because it
allows the programmer to reason as if variable bindings are carried out by substitution. Static scoping also makes it much easier
to make modular code and reason about it, since its binding structure can be understood in isolation.
Dynamic scoping: each identifier has a global stack of bindings. Introducing a local variable with name x pushes a binding onto
the global x stack (which may have been empty), which is popped off when the control flow leaves the scope. Evaluating x in
any context always yields the top binding. In other words, a global identifier refers to the identifier associated with the most
recent environment. Note that this cannot be done at compile time because the binding stack only exists at runtime.


Tree Traversals

Preorder traversal (a depth-first traversal) performs the following operations recursively at each node, starting with the root node:
    1. Visit the node. 2. Traverse the left subtree. 3. Traverse the right subtree.
Inorder traversal (symmetric traversal) performs the following operations recursively at each node:
    1. Traverse the left subtree. 2. Visit the node. 3. Traverse the right subtree.
Postorder traversal performs the following operations recursively at each node:
    1. Traverse the left subtree. 2. Traverse the right subtree. 3. Visit the node.
Trees can also be traversed in level-order; this is also called breadth-first traversal.
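
A minimal recursive sketch of the three depth-first orders (Node is a hypothetical type, not from the source):

#include <cstdio>

struct Node { int value; Node *left; Node *right; };

void preorder(const Node *n)  { if (!n) return; std::printf("%d ", n->value); preorder(n->left); preorder(n->right); }
void inorder(const Node *n)   { if (!n) return; inorder(n->left); std::printf("%d ", n->value); inorder(n->right); }
void postorder(const Node *n) { if (!n) return; postorder(n->left); postorder(n->right); std::printf("%d ", n->value); }

Level-order, by contrast, is iterative: visit nodes taken from a FIFO queue, pushing each node's children as it is visited.

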
Tree

A tree is a data structure that emulates a hierarchical tree structure with a set of linked nodes. It is an acyclic connected graph
where each node has zero or more child nodes and at most one parent node.


Can you use memset() to set all data members in a class to 0?

No; use a constructor for initialization. memset() is only valid for POD (plain old data) types, not for classes with virtual
functions, where it would overwrite the vtable pointer.


A program starts to run but crashes before it gets into main(). Why?

static/global variables are initialized before main(); look for a crash in the initialization of those variables.


Debug

When you debug a Java program, you insert a System.out.print call to output debug information, but you find the output is gone.
Any reason? (The interviewer mentioned it is related to memory leakage.)
Answer:
A: I don't think this can be a memory leak, especially in Java. It looks more like a synchronization problem; an I/O operation
reduces the opportunity for race conditions. In C++, this could also be memory (stack) corruption, but it is not very likely to be
related to a memory leak.
B: I do not see much chance of a memory leak. One possible scenario is that an exception is thrown and the print method never
completes.


register in C

Register variables are stored in CPU registers; other variables are stored in memory chips, so register access is faster. (The
register keyword is only a hint, which the compiler may ignore.)


.bss (about OSN)

Uninitialized data is stored in .bss; initialized data is stored in the data segment.


memory leak

A memory leak occurs when a program is unable to release memory it has acquired. A memory leak can diminish the
performance of the computer by reducing the amount of available memory.


memory overflow

A stack overflow occurs when too much memory is used on the call stack, typically resulting in a program crash. The call stack
has a limited amount of memory, usually determined at the start of the program.

stack

A stack is a last-in, first-out (LIFO) abstract data type. A stack can have any abstract data type as an element, but it is
characterized by only two operations: push and pop.


heap

A heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B).
This implies that an element with the greatest key is always in the root node, and so such a heap is called a max-heap.
(Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min-heap.) Heaps
are a favorite data structure for many applications (a short usage sketch follows this list).
* Heapsort: one of the best sorting methods, being in-place and with no quadratic worst-case scenarios.
* Selection algorithms: finding the min, the max, or both, the median, or even any k-th element in sublinear time can be done
dynamically with heaps.
* Graph algorithms: by using heaps as internal traversal data structures, run time is reduced by a polynomial order.
Examples of such problems are Prim's minimal spanning tree algorithm and Dijkstra's shortest path algorithm.
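
A short sketch using the C++ standard library's heap-backed container (output assumes the listed pushes):

#include <cstdio>
#include <initializer_list>
#include <queue>

int main() {
    std::priority_queue<int> heap; // a max-heap by default
    for (int v : {3, 1, 4, 1, 5}) heap.push(v);
    while (!heap.empty()) {
        std::printf("%d ", heap.top()); // prints 5 4 3 1 1 -- greatest key first
        heap.pop();
    }
    return 0;
}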


binary search tree

A binary search tree (BST) is a node-based binary tree data structure with the following properties:
    * The left subtree of a node contains only nodes with keys less than the node's key.
    * The right subtree of a node contains only nodes with keys greater than the node's key.
    * Both the left and right subtrees must also be binary search trees.
The major advantage of binary search trees over other data structures is that the related sorting and search algorithms, such as
in-order traversal, can be very efficient.
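
A minimal search sketch relying on the properties above (Node is a hypothetical type):

struct Node { int value; Node *left; Node *right; };

const Node *find(const Node *n, int key) {
    while (n && n->value != key)
        n = (key < n->value) ? n->left : n->right; // the BST property discards one subtree per step
    return n; // null if the key is absent
}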


grammar

A grammar is a set of rules that describe which strings formed from the alphabet of a formal language are syntactically valid
within the language. A grammar only addresses the location and manipulation of the strings of the language. It does not
describe anything else about a language, such as its semantics (i.e. what the strings mean). A grammar consists of a set of string
rewriting rules with an assigned start symbol; the language described is the set of strings that can be generated by applying
these rules arbitrarily, starting with the start symbol.

A grammar is usually thought of as a language generator, but it can also be used as the basis for a recognizer that determines for
any given string whether it belongs to the language.

The process of recognizing a string by constructing a combination of applications of rules that generate it is known as parsing.
Most languages have very compositional semantics, i.e. the meaning of their utterances is structured according to their syntax;
therefore, the first step to describing the meaning of an utterance in language is to analyze it and look at its analyzed form
(known as its parse tree in computer science, and as its deep structure in generative grammar).


how is the source file compiled to executable?

http://caml.inria.fr/pub/docs/oreilly-book/html.bak/book-ora065.html: source program -> (preprocessing) -> preprocessed
source -> (compiling) -> assembly program -> (assembling) -> machine instructions (object code) -> (linking) -> executable code


Given class A, how to repeat all functions of A in class B without using extends

Answer:
A: Forward calls to the methods of class A; this may be called delegation in C++ or C#.
B: Just wrap the functions of A in class B with the same method names, as often seen in the decorator pattern or in Spring
dependency injection.


Evaluation strategy

In call-by-value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the
function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its
parameters, only its local copy is assigned — that is, anything passed into a function call is unchanged in the caller's scope
when the function returns. Call-by-value is not a single evaluation strategy, but rather the family of evaluation strategies in
which a function's argument is evaluated before being passed to the function. The term "call-by-value" is sometimes
problematic, as the value implied is not the value of the variable as understood by the ordinary meaning of value, but an
implementation-specific reference to the value. The term "call-by-value where the value is a reference" is common (but should
not be understood as being call-by-reference). Thus the behaviour of call-by-value Java or Visual Basic and call-by-value C or
Pascal are significantly different: in C or Pascal, calling a function with a large structure as an argument will cause the entire
structure to be copied, potentially causing serious performance degradation, and mutations to the structure are invisible to the
caller. However, in Java or Visual Basic only the reference to the structure is copied, which is fast, and mutations to the
structure are visible to the caller.

In call-by-reference evaluation, a function receives an implicit reference to the argument, rather than a copy of its value. This
typically means that the function can modify the argument, and the modification will be seen by its caller. Call-by-reference therefore
has the advantage of greater time- and space-efficiency (since arguments do not need to be copied), as well as the potential for
greater communication between a function and its caller (since the function can return information using its reference
arguments), but the disadvantage that a function must often take special steps to "protect" values it wishes to pass to other
functions.
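
A small sketch of the two strategies as they appear in C++ (names are illustrative):

#include <cstdio>

void by_value(int x)      { x = 42; } // modifies a local copy only
void by_reference(int &x) { x = 42; } // modifies the caller's variable

int main() {
    int a = 0, b = 0;
    by_value(a);
    by_reference(b);
    std::printf("a=%d b=%d\n", a, b); // prints a=0 b=42
    return 0;
}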

Call by name: In call-by-name evaluation, the arguments to functions are not evaluated at all — rather, function arguments are
substituted directly into the function body using capture-avoiding substitution. If the argument is not used in the evaluation of
the function, it is never evaluated; if the argument is used several times, it is re-evaluated each time. Call-by-name evaluation
can be preferable over call-by-value evaluation because call-by-name evaluation always yields a value when a value exists,
whereas call-by-value may not terminate if the function's argument is a non-terminating computation that is not needed to
evaluate the function. Opponents of call-by-name claim that it is significantly slower when the function argument is used, and
that in practice this is almost always the case as a mechanism such as a thunk is needed.

Call-by-need is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for
subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name; when the function argument is
used two or more times, call-by-need is almost always faster.

Call-by-macro-expansion is similar to call-by-name, but uses textual substitution rather than capture-avoiding substitution. With
incautious use, macro substitution may result in variable capture and lead to undesired behavior.


important features of OOP? Why is OOP better?

encapsulation, dynamic dispatch, inheritance, subtype polymorphism, modularity. The methodology focuses on data rather than
processes, with programs composed of self-sufficient modules (objects), each containing all the information needed to
manipulate its own data structure. An object-oriented program may thus be viewed as a collection of cooperating objects, as
opposed to the conventional model, in which a program is seen as a list of tasks (subroutines) to perform.


enum

An enumeration is a type that can hold a set of values specified by the user.


struct and union

A struct is an aggregate of elements of (nearly) arbitrary types. A named union is defined as a struct where every member has
the same address. A union can have member functions but not static members. In general, a compiler cannot know which
member of a union is in use; that is, the type of the object stored in a union is unknown. Consequently, a union may not have
members with constructors or destructors. Unions are best used in low-level code, or as part of the implementation of classes
that keep track of what is stored in the union.


Hash table

A hash table or hash map uses a hash function to efficiently map certain identifiers or keys to associated values. The hash
function transforms the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to
be sought.

http://www.relisoft.com/book/lang/pointer/8hash.html Before explaining how the hash table works, let me make a little
digression about algorithms that use table lookup. Accessing a table is a very fast operation. So, if we have a function whose
values can be pre-computed and stored in a table, we can trade memory for speed. The isdigit function (macro) is a prime
example of such a tradeoff. The naive implementation would be

inline bool IsDigitSlow (char c) {
    return c >= '0' && c <= '9';
}

However, if we notice that there can be only 256 different arguments to isdigit, we can pre-compute them all and store them in a
table. Let's define the class CharTable that stores the pre-computed values:
class CharTable {
public:
    CharTable ();
    bool IsDigit (unsigned char c) { return _tab [c]; }
private:
    bool _tab [UCHAR_MAX + 1]; // limits.h
};

CharTable::CharTable () {
    for (int i = 0; i <= UCHAR_MAX; ++i) {
        // use the slow method
        if (i >= '0' && i <= '9')
            _tab [i] = true;
        else
            _tab [i] = false;
    }
}

CharTable TheCharTable;
Now we could quickly find out whether a given character is a digit by calling
TheCharTable.IsDigit (c)

In reality the isdigit macro is implemented using a lookup of a statically initialized table of bit fields, where every bit
corresponds to one property, such as being a digit, being a white space, being an alphanumeric character, etc.

The hash table data structure is based on the idea of using table lookup to speed up an arbitrary mapping. For our purposes, we
are interested in mapping strings into integers. We cannot use strings directly as indices into an array. However, we can define
an auxiliary function that converts strings into such indices. Such a function is called a hash function. Thus we could imagine a
two-step process to map a string into an integer: for a given string calculate the hash function and then use the result to access
an array that contains the pre-computed value of the mapping at that offset.

Such hashing, called perfect hashing, is usually difficult to implement. In the imperfect world we are usually satisfied with a
flawed hash function that may occasionally map two or more different strings into the same index. Such a situation is called a
collision. Because of collisions, the hash table maps a string not into a single value but rather into a "short list" of candidates.
By further searching this list we can find the string we are interested in, together with the value into which it is mapped.




Figure 13. The hash function for the string One is the same as for the string Three. The collision is dealt with by creating a short
list that contains the id's for both strings.

This algorithm becomes efficient when the number of strings to be mapped is large enough. Direct linear search among N
strings would require, on average, N/2 comparisons. On the other hand, if the size of the hash table is larger than N, the search
requires, on average, one comparison (plus the calculation of the hash function). For instance, in our string table we can store at
most 100 strings. Finding a given string directly in such a table would require, on average, 50 string comparisons. If we spread
these strings in a 127-entry array using a hashing function that randomizes the strings reasonably well, we can expect slightly
more than one comparison on the average. That's a significant improvement.

Here is the definition of the class HTable. The table itself is an array of lists (these are the "short lists" we were talking
about). Most of them will contain zero or one element. In the rare case of a conflict, that is, two or more strings hashing into the
same index, some lists may be longer than that.

const int sizeHTable = 127;

// Hash table of strings
class HTable {
public:
    List const & Find (char const * str) const; // return a short list of candidates
    void Add (char const * str, int id);        // add another string->id mapping
private:
    int hash (char const * str) const;          // the hashing function
    List _aList [sizeHTable];                   // an array of (short) lists
};

// Find the list in the hash table that may contain the id of the string we are looking for
List const & HTable::Find (char const * str) const {
    int i = hash (str);
    return _aList [i];
}

void HTable::Add (char const * str, int id) {
    int i = hash (str);
    _aList [i].Add (id);
}

The choice of a hash function is important. We don't want to have too many conflicts. The shift-and-add algorithm is one of the
best string randomizers.

int HTable::hash (char const * str) const {
    // no empty strings, please
    assert (str != 0 && str [0] != 0);
    unsigned h = str [0];
    for (int i = 1; str [i] != 0; ++i)
        h = (h << 4) + str [i];
    return h % sizeHTable; // remainder
}

The expression h << 4 is equal to h shifted left by 4 bits (that is, multiplied by 16).
In the last step of the hashing algorithm we calculate the remainder of the division of h by the size of the hash table. This value
can be used directly as an index into the array of sizeHTable entries. The size of the table is also important. Powers of 2 are
worst: they create a lot of conflicts. Prime numbers are best; usually a power of 2 plus or minus one will do. In our case
127 = 2^7 - 1, which happens to be a prime number.
The hash function of the string "One" is 114. It is calculated as follows:

    char    ASCII    h
    'O'     0x4F     0x4F
    'n'     0x6E     0x55E
    'e'     0x65     0x5645

The remainder of the division of h by 127 is 114, so the id of string "One" will be stored at offset 114 in the hash table array.


inline function

An inline function is used to tell a compiler that it should perform in-line expansion of a particular function. In other words, the
compiler will insert the complete body of the function in every place in the code where that function is called.


Dynamic programming language

Dynamic programming languages execute at runtime many common behaviors that other languages might perform during
compilation. These behaviors could include extension of the program, by adding new code, by extending objects and definitions,
or by modifying the type system, all during program execution. These behaviors can be emulated in nearly any language of
sufficient complexity, but dynamic languages provide direct tools to make use of them. Most dynamic languages are
dynamically typed, but not all.
Implementations commonly provide:
* Higher-order functions
* Object runtime alteration
* Closures: one of the most widely used aspects of functional programming is the closure, which allows creating a new instance
of a function which retains access to the context in which it was created.
* Continuations: continuations represent execution state that can be re-invoked. For example, a parser might return an
intermediate result and a continuation that, when invoked, will continue to parse the input. Continuations interact in very
complex ways with scoping, especially with respect to closures. For this reason, many dynamic languages do not provide
continuations.
* Reflection: reflection involves analysis of the types and metadata of generic or polymorphic data. It can also include full
evaluation and modification of a program's code as data, such as the features that Lisp provides in analyzing S-expressions.
* Macros: a limited number of dynamic programming languages provide features which combine code introspection and
eval in a feature called macros. In C or C++, macros are a static feature built into a small subset of the language, and
are capable only of string substitutions on the text of the program. In dynamic languages, however, they provide access to the
inner workings of the compiler, and full access to the interpreter, virtual machine, or runtime, allowing the definition of
language-like constructs which can optimize code or modify the syntax or grammar of the language.


Reflection

Reflection is the process by which a computer program can observe and modify its own structure and behavior. It is a particular
kind of metaprogramming.
In many computer architectures, program instructions are stored as data; hence the distinction between instruction and data is
merely a matter of how the information is treated by the computer and the programming language. In some languages, programs
can also treat instructions as data and therefore make reflective modifications. Reflection is most commonly used in high-level
virtual machine programming languages like Smalltalk and in scripting languages, and less commonly in manifestly typed
and/or statically typed languages such as Java and C.


Strongly vs weak typed programming language

The term strong typing is used to describe those situations where programming languages specify one or more restrictions on
how operations involving values of different data types can be intermixed, preventing the compilation or running of source
code which uses data in what is considered to be an invalid way. For instance, an integer division operation may not be used
upon strings; a procedure which operates upon linked lists may not be used upon numbers. However, the nature and strength of
these restrictions is highly variable. Luca Cardelli describes strong typing simply as the absence of unchecked run-time type
errors.

weakly typed programming languages are those that support either implicit type conversion (nearly all languages support at
least one implicit type conversion), ad-hoc polymorphism (also known as overloading) or both. These less restrictive usage
rules can give the impression that strict adherence to typing rules is less important than in strongly typed languages and hence
that the type system is "weaker". However, such languages usually have restrictions on what programmers can do with values
of a given type; thus it is possible for a weakly typed language to be type safe. Moreover, weakly typed languages may be
statically typed, in which case overloading is resolved statically and type conversion operations are inserted by the compiler, or
dynamically typed, in which case everything is resolved at run time.


Static typing vs Dynamic typing

Static typing: type checking is performed at compile time. In static typing, types are associated with variables, not values.
Dynamic typing is when the majority of type checking is performed at run time. In dynamic typing, types are associated with
values, not variables.

Compared to static typing, dynamic typing can be more flexible (e.g. by allowing programs to generate types and functionality
based on run-time data), though at the expense of fewer a priori guarantees. This is because a dynamically typed language
accepts and attempts to execute some programs which may be ruled invalid by a static type checker. Dynamic typing may
result in runtime type errors; these may occur long after the place where the programming mistake was made, that is, where
the wrong type of data was passed into a place it should not have been. This makes the bug difficult to locate.

Dynamically typed language systems, compared to their statically typed cousins, make fewer "compile-time" checks on the
source code. Run-time checks can potentially be more sophisticated, since they can use dynamic information as well as any
information that was present during compilation. On the other hand, runtime checks only assert that conditions hold in a
particular execution of the program, and these checks are repeated for every execution of the program.

Development in dynamically typed languages is often supported by programming practices such as unit testing. In practice, the
testing done to ensure correct program operation can detect a much wider range of errors than static type-checking, but
conversely cannot search as comprehensively for the errors that both testing and static type checking are able to detect.

Combinations of dynamic and static typing: The presence of static typing in a programming language does not necessarily
imply the absence of all dynamic typing mechanisms. For example, Java, and various other object-oriented languages, while
using static typing, require for certain operations (downcasting) the support of runtime type tests, a form of dynamic typing.


Scripting language

A scripting language or extension language allows control of one or more software applications. "Scripts" are distinct from the core
code of the application, which is usually written in a different language, and are often created or at least modified by the
end-user. Scripts are often interpreted from source code or bytecode, whereas the applications they control are traditionally
compiled to native machine code.


Optimal substructure

A problem is said to have optimal substructure if an optimal solution can be constructed efficiently from optimal solutions to its
subproblems. This property is used to determine the usefulness of dynamic programming and greedy algorithms for a problem.

Typically, a greedy algorithm is used to solve a problem with optimal substructure if it can be proved by induction that the
greedy choice is optimal at each step. Otherwise, provided the problem exhibits overlapping subproblems as well, dynamic
programming is used. If there are no appropriate greedy algorithms and the problem fails to exhibit overlapping subproblems,
often a lengthy but straightforward search of the solution space is the best alternative.

A slightly more formal definition of optimal substructure can be given. Let a "problem" be a collection of "alternatives", and let
each alternative have an associated cost, c(a). The task is to find a set of alternatives that minimizes c(a). Suppose that the
alternatives can be partitioned into subsets, where each subset has its own cost function, and each alternative belongs to only
one subset. The minima of each of these cost functions can be found, as can the minima of the global cost function, restricted to
the same subsets. If these minima match for each subset, then it's almost obvious that a global minimum can be picked not out
of the full set of alternatives, but out of only the set that consists of the minima of the smaller, local cost functions we have
defined. If minimizing the local functions is a problem of "lower order", and (specifically) if, after a finite number of these
reductions, the problem becomes trivial, then the problem has an optimal substructure.


Dynamic programming

Dynamic programming is mainly used to tackle problems that are solvable in polynomial time. There are two key attributes that
a problem must have in order for dynamic programming to be applicable: optimal substructure and overlapping subproblems.
Overlapping subproblems means that the space of subproblems must be small; that is, any recursive algorithm solving the
problem should solve the same subproblems over and over, rather than generating new subproblems. This can be achieved in
either of two ways:
* Top-down approach: this is the direct fallout of the recursive formulation of any problem. If the solution to any problem can
be formulated recursively using the solutions to its subproblems, and if its subproblems overlap, then one can easily
memoize, or store, the solutions to the subproblems in a table. Whenever we attempt to solve a new subproblem, we first check
the table to see if it is already solved. If a solution has been recorded, we can use it directly; otherwise we solve the subproblem
and add its solution to the table.
* Bottom-up approach: this is the more interesting case. Once we formulate the solution to a problem recursively in terms of
its subproblems, we can try reformulating the problem in a bottom-up fashion: solve the subproblems first and use their
solutions to build on and arrive at solutions to bigger subproblems. This is also usually done in a tabular form, by iteratively
generating solutions to bigger and bigger subproblems using the solutions to smaller subproblems.
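
A toy sketch of both approaches, using Fibonacci numbers as the overlapping subproblems:

#include <cstdio>
#include <vector>

long fib_memo(int n, std::vector<long> &memo) { // top-down: check the table first
    if (n < 2) return n;
    if (memo[n] != -1) return memo[n];
    return memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo);
}

long fib_bottom_up(int n) { // bottom-up: solve smaller subproblems first
    if (n < 2) return n;
    long prev = 0, cur = 1;
    for (int i = 2; i <= n; ++i) {
        long next = prev + cur;
        prev = cur;
        cur = next;
    }
    return cur;
}

int main() {
    std::vector<long> memo(31, -1);
    std::printf("%ld %ld\n", fib_memo(30, memo), fib_bottom_up(30)); // 832040 832040
    return 0;
}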


Greedy algorithm

A greedy algorithm makes the locally optimal choice at each stage in the hope of finding the global optimum. For example,
applying the greedy strategy to the traveling salesman problem yields the following algorithm: "At each stage visit the unvisited
city nearest to the current city". In general, greedy algorithms have five pillars (see the sketch after this list):
    1. A candidate set, from which a solution is created
    2. A selection function, which chooses the best candidate to be added to the solution
    3. A feasibility function, which is used to determine whether a candidate can contribute to a solution
    4. An objective function, which assigns a value to a solution or a partial solution, and
    5. A solution function, which indicates when we have discovered a complete solution
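
A small sketch of these pillars on coin change with canonical US denominations (greedy happens to be optimal here; it can
fail for arbitrary denominations):

#include <cstdio>

int main() {
    const int coins[] = {25, 10, 5, 1}; // candidate set, ordered by the selection function
    int amount = 67, used = 0;          // objective: minimize the number of coins used
    for (int c : coins)
        while (amount >= c) {           // feasibility: the coin must not overshoot
            amount -= c;
            ++used;
        }
    std::printf("%d coins\n", used);    // 67 = 25+25+10+5+1+1 -> 6 coins
    return 0;
}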


Random numbers

http://www.wangchao.net.cn/bbsdetail_61249.html "Confusion about random numbers (java.util.Random / Math.random())"
Background: in a J2ME game I wanted a random position for a submarine to appear; really just a Y coordinate. I used the
nextInt method, with code similar to the following. Strangely, when fetching two random numbers I got the same number twice:

class T {
    int x;
    java.util.Random r = new java.util.Random();
    T() {
        x = (r.nextInt() >>> 1) % 10; // produce a number between 0 and 9
    }
    int getT() {
        return this.x; // return the random number
    }
}

public class TR {
    public static void main(String[] args) {
        for (int i = 0; i < 2; i++) {
            T t = new T();
            System.out.println(t.getT()); // print the random number; the two results are very often identical
        }
    }
}

Some reading on random numbers leads to the following conclusion. The numbers come out identical because the generator is
seeded with the current time: new Random() uses the time as its seed, so when the objects are created too quickly they receive
the same seed. One workaround is to wait a short interval between generating the numbers, e.g.
try { Thread.sleep(100); } catch (Exception e) {}, after which the problem disappears.

Ways to generate random numbers:
1. Create a Random() and call nextInt(); to restrict the result to a range, take the non-negative part and the modulus, e.g.
(rand.nextInt() >>> 1) % 101.
2. Using nextInt() this way is dated, and the shift-and-modulus trick tends to skew toward small values; newer JDK versions
provide nextInt(100), which produces a random number from 0 to 99.
3. Alternatively, Math.random() produces a floating-point number between 0 and 1; multiply by the desired range and truncate
to an integer. This method effectively avoids the duplication problem above.
In summary, nextInt(bound) is usually the best choice; when not doing heavy computation, Math.random() is also fine (it can
be a little slower: floating point, then truncation). In my case I had no choice but to use the first method.


XOR

http://www.mitbbs.com/article_t/JobHunting/31510205.html "About XOR"
The following is from baidu: http://baike.baidu.com/view/1452266.html?fromTaglist
XOR can also be used to swap the values of two integer variables without passing through a third variable.
      For example:
      a = 9;
      b = 10;
      a = a ^ b;
      b = b ^ a;
      a = a ^ b;
      The result is that a is 10 and b is 9.




Architecture, system

How can the main thread know that a worker thread is finished?

Use an event to notify completion of the worker thread.
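
With plain C++ threads, the simplest form of this notification is join(), which blocks until the worker ends; a condition
variable (or an event object on Windows) supports the same signal without blocking the whole time. A minimal sketch:

#include <cstdio>
#include <thread>

int main() {
    std::thread worker([] { std::puts("working..."); });
    worker.join(); // the main thread blocks here until the worker has finished
    std::puts("worker finished");
    return 0;
}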


multicast, broadcast, unicast

Unicast transmission is the sending of information packets to a single network destination. Unicast messaging is used for all
network processes in which a private or unique resource is requested, making most networking traffic unicast in form. Unicast
is used where two-way connections are needed to complete the network transaction.

Broadcasting refers to transmitting a packet that will be received (conceptually) by every device on the network. Not all
computer networks support broadcasting; for example, neither X.25 nor frame relay supplies a broadcast capability, nor is there
any form of Internet-wide broadcast. Broadcasting is largely confined to local area network (LAN) technologies, most notably
Ethernet and Token Ring, where the performance impact of broadcasting is not as large as it would be in a wide area network.

Multicast addressing is a network technology for the delivery of information to a group of destinations simultaneously using the
most efficient strategy to deliver the messages over each link of the network only once, creating copies only when the links to
the multiple destinations split. The word "multicast" is typically used to refer to IP multicast which is often employed for
streaming media and Internet television applications.


differences between thread and process

http://wiki.answers.com/Q/What_is_the_difference_between_processes_and_threads: A process is the memory set aside for an
application to be executed in. Within this process, the thing which is really executed is the thread. The key difference is that
processes are fully isolated from each other, while threads share (heap) memory with other threads running in the same application.
Threads have direct access to the data segment of their process; processes have their own copy of the data segment of the parent
process.
Threads can directly communicate with other threads of their process; processes must use inter-process communication to
communicate with sibling processes.
Threads have almost no overhead; processes have considerable overhead.
New threads are easily created; new processes require duplication of the parent process.
Threads can exercise considerable control over threads of the same process; processes can only exercise control over child
processes.

http://www.geekinterview.com/question_details/16639: Changes to the main thread (cancellation, priority change, etc.) may
affect the behavior of the other threads of the process, changes to the parent process does not affect child processes.

Both have an id, a set of registers, a state, a priority, and a scheduling policy. Both have attributes that describe the entity to the
OS. Both have an information block. Both share resources with the parent process. Both function as independent entities from
the parent process. The creator can exercise some control over the thread or process. Both can change their attributes. Both can
create new resources. Neither can access the resources of another process.

Parent and child processes have different code, data, and text segments, but two threads of the same process share the code and
data segments and have separate stacks.

A thread is a stream of instructions which can be scheduled independently (i.e., it has its own program counter and stack), but a
thread shares resources such as program code, directories, and global data with the calling process. A process, on the other hand,
has its own copy of both resources and scheduling information. A process can have many threads; threads are often called
lightweight processes.

The operating system is aware of processes, not threads; threads are visible only within the process that created them, in user
space.


Advantage of thread over process

http://www.zdv.uni-mainz.de/cms-extern/DUS/progtool/dce31unx/develop/appdev/Appde153.htm: With a threads package, a
programmer can create multiple threads within a process. Threads execute concurrently and, within a multithreaded process,
there are at any time multiple points of execution. Threads execute within a single address space. Multithreaded programming
offers the following advantages:
· Performance: Threads improve the performance (throughput, computational speed, responsiveness, or some combination of
these) of a program. Multiple threads are useful in a multiprocessor system where threads run concurrently on separate
processors. In addition, multiple threads also improve program performance on single processor systems by permitting the
overlap of input and output or other slow operations with computational operations. You can think of threads as executing
simultaneously, regardless of the number of processors present. You cannot make any assumptions about the start or finish
times of threads or the sequence in which they execute, unless explicitly synchronized.
· Shared Resources: An advantage of using multiple threads over using separate processes is that the former share a single
address space, all open files, and other resources.
· Potential Simplicity: Multiple threads can reduce the complexity of some applications that are inherently suited for threads.


Thread synchronization

1.   Mutex: a named mutex can work across processes and supports timeouts; it is relatively expensive (user-mode to kernel-mode transition).
2.   Critical section: unable to work across processes.
3.   Semaphore: multiple threads can access the resource simultaneously (P: decrement, possess the resource; V: increment, release it).
4.   Shared memory.
5.   Socket, queue/pipe.
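
A quick sketch of the most common case, a mutex protecting a shared counter (POSIX threads assumed; the names are illustrative). Without the lock, increments from the two threads would be lost:

#include <stdio.h>
#include <pthread.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section */
        counter++;
        pthread_mutex_unlock(&lock);  /* leave critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);  /* 2000000 with the mutex */
    return 0;
}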


multi-thread

Multithreading is the process of creating multiple threads and assigning a task to each; the threads execute concurrently,
increasing efficiency.


difference between multithreading and multitasking

A task is "an execution path through address space" - in other words, a set of program instructions that are loaded in memory.
The address registers are loaded with the initial address of the program; at the next clock cycle, the CPU starts execution in
accordance with the program. The sense is that some part of 'a plan is being accomplished'. As long as the program remains in
this part of the address space, the task can continue, in principle, indefinitely, unless the program instructions contain a halt,
exit, or return. "Task" carries the sense of a real-time application, as distinguished from a process, which takes up space
(memory) and execution time.

Both "task" and "process" should be distinguished from event, which takes place at a specific time and place, and which can be
planned for in a computer program.

multitasking is a method by which multiple tasks, also known as processes, share common processing resources such as a CPU.
In the case of a computer with a single CPU, only one task is said to be running at any point in time, meaning that the CPU is
actively executing instructions for that task. Multitasking solves the problem by scheduling which task may be the one running
at any given time, and when another waiting task gets a turn. The act of reassigning a CPU from one task to another one is
called a context switch. When context switches occur frequently enough the illusion of parallelism is achieved. Even on
computers with more than one CPU (called multiprocessor machines), multitasking allows many more tasks to be run than there
are CPUs.

As multitasking greatly improved the throughput of computers, programmers started to implement applications as sets of
cooperating processes (e.g. one process gathering input data, one process processing input data, one process writing out results
on disk.) This, however, required some tools to allow processes to efficiently exchange data.

Threads were born from the idea that the most efficient way for cooperating processes to exchange data would be to share their
entire memory space. Thus, threads are basically processes that run in the same memory context. Threads are described as
lightweight because switching between threads does not involve changing the memory context.


A 4-byte datum is supposed to be 1; however, after casting it to an int, the value is not correct. Why?

This is an endianness problem. The MSB of a big-endian number is the leftmost byte. Thus, if the data is big-endian but we
interpret it as little-endian, the number comes out wrong.


Following the last question: if the int is little-endian, what is its value when read as big-endian?

Answer: 2^24 = 16,777,216 (the byte holding the 1 becomes the most significant byte).
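
A small sketch demonstrating the effect, assuming a little-endian host; the byte swap is written out by hand to simulate reading the value with the wrong endianness:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void) {
    unsigned char bytes[4] = {0x01, 0x00, 0x00, 0x00}; /* 1, little-endian */
    uint32_t as_is, swapped;
    memcpy(&as_is, bytes, 4);          /* interpreted in host byte order */

    /* reverse the bytes: what a big-endian reader would see */
    swapped = ((as_is & 0x000000FFu) << 24) |
              ((as_is & 0x0000FF00u) << 8)  |
              ((as_is & 0x00FF0000u) >> 8)  |
              ((as_is & 0xFF000000u) >> 24);

    printf("host order: %u, byte-swapped: %u\n", as_is, swapped);
    /* on a little-endian machine: 1 and 16777216 (2^24) */
    return 0;
}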


Binary semaphore vs. mutex.

a) It seems that a mutex is often implemented on top of a binary semaphore.
b) Maybe a semaphore is applicable to IPC (inter-process communication) but a mutex is not. See:
http://en.wikipedia.org/w/index.php?title=Inter-process_communication&oldid=82483721
They only look alike; a mutex can in turn be used to guard a semaphore's values, as the third sketch below shows.
******semaphore*********
struct semaphore {
    int val;      /* only 0 or 1 */
    proc *queue;  /* processes blocked on this semaphore */
};

waitB(S) {
    if (S.val == 0) {
        append(S.queue, Proc);
        block(Proc);
        /* Proc resumes here when unblocked */
    } else {
        S.val--;
    }
}

signalB(S) {
    if (S.val == 0) {
        if (S.queue == NULL) {
            S.val++;
        } else {
            /* a process was waiting: wake it instead of raising val */
            resume(dequeue(S.queue));
        }
    }
}
************************

******mutex*************
struct mutex {
    int owner;    /* initialized to NULL */
    proc *queue;
};

enter(M) {
    if (M.owner != NULL) {
        append(M.queue, Proc);
        block(Proc);
        /* Proc resumes here when the previous owner exits */
    }
    M.owner = Proc;
}

exit(M) {
    if (M.owner == Proc) {
        if (M.queue == NULL) M.owner = NULL;
        else resume(dequeue(M.queue));
    } else error("Not owner");
}
*********************************

******semaphore with mutex*************
struct semaphore {
    struct mutex m;   /* protects val and queue */
    int val;
    proc *queue;
};

waitC(S) {
    enter(S.m);
    if (S.val == 0) {
        append(S.queue, Proc);
        exit(S.m);        /* release the mutex before blocking */
        block(Proc);
        enter(S.m);       /* reacquire it after being resumed */
    }
    S.val--;
    exit(S.m);
}

signalC(S) {
    enter(S.m);
    S.val++;
    if (S.queue != NULL) {
        newP = dequeue(S.queue);
        exit(S.m);
        /* a process was waiting */
        resume(newP);
    } else {
        exit(S.m);
    }
}
***************************************

Mutex vs. semaphore

A mutex allows exactly one thread to access a resource at any particular time, and typically only the thread that acquired it may
release it. A (counting) semaphore allows up to a fixed number of threads to access the resource at each time instance.
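
A minimal sketch, assuming POSIX unnamed semaphores: at most 3 of the 8 threads hold the "resource" at once, whereas a mutex would admit only 1. The thread count and slot count are illustrative:

#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static sem_t slots;

void *worker(void *arg) {
    sem_wait(&slots);            /* P: take a slot, block if none left */
    printf("thread %ld in critical region\n", (long)arg);
    sleep(1);                    /* hold the resource briefly */
    sem_post(&slots);            /* V: release the slot */
    return NULL;
}

int main(void) {
    pthread_t t[8];
    sem_init(&slots, 0, 3);      /* counting semaphore initialized to 3 */
    for (long i = 0; i < 8; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 8; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&slots);
    return 0;
}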


Real time System

A real-time process is a process that must respond to events within a certain time period. A real-time operating system is an
operating system that can run real-time processes successfully. It is often used as a control device in a dedicated application,
such as controlling scientific experiments, medical imaging systems, and some display systems.

Real-time systems may be either hard or soft real-time. Hard real-time: secondary storage is limited or absent, with data stored
in short-term memory or read-only memory (ROM); this conflicts with time-sharing systems and is not supported by
general-purpose operating systems. Soft real-time: of limited utility in industrial control or robotics, but useful in applications
(multimedia, virtual reality) requiring advanced operating-system features.

A hard real-time system guarantees that critical tasks complete on time. This goal requires that all delays in the system be
bounded, from the retrieval of stored data to the time it takes the operating system to finish any request made of it. A soft
real-time system is one where a critical real-time task gets priority over other tasks and retains that priority until it completes.
As in hard real-time systems, kernel delays need to be bounded.


cache memory

http://faq.javaranch.com/view?CachingStrategies: A cache is an amount of faster memory used to improve data access by storing
portions of a data set the whole of which is slower to access. For example, most modern processors run at a much
higher clock speed than the main memory of the computers they are in; in order to avoid being slowed down to the speed of
main memory every time they have to access some data, processors typically have higher speed caches to store the data they are
currently working with. Sometimes the total data set isn't actually stored at all; instead, each data item is calculated as necessary,
in which case the cache stores results from the calculations.

How does a cache work?: When a datum is needed, the cache is checked to see if it contains the datum. If it does, the datum is
used from the cache, without having to access the main data store. This is known as a 'cache hit'. If the datum is not in the cache,
it is transferred from the data store; this is known as a 'cache miss'. When the cache fills up, items are ejected from the cache to
make space for new items.

When should my software use a cache?: If you work with a large data store, a cache may speed up your software. If you
perform calculations that may be expensive relative to the amount of data produced - for example, complex database queries -
caching some of the results may provide benefits. As with all optimizations, it's best to thoroughly understand where your
software spends its time before implementing a cache. It's usually a bad idea to implement a cache that simply duplicates the
purpose of an existing cache.

What issues should I consider when designing a cache?: Probably the key design issue for a cache is selection of the algorithm
used to decide which data to retain in the cache. To decide which algorithm is best for your purposes, it's good to know your
data access patterns. Key issues are whether your data access is temporally clustered - that is, whether data that has been
recently accessed is more likely to be accessed again; whether there tend to be scans - sequential accesses - to significant
chunks of the data, as opposed to random accesses; and how uniform the frequency of access is to various items in the data set.
In addition, some algorithms are more complex to implement than others; some require more or less calculational and memory
overhead than others; and some have parameters that need to be tuned to get good performance. If the main data set can be
modified by multiple clients each maintaining their own cache, you may also have to worry about cache coherency - whether
the different clients' caches have consistent, 'coherent' views of the main data set.

Some popular algorithms are FIFO, LRU, LFU, LRU2, 2Q, and time-based expiration. Most of the names describe the strategy
used to eject items from the cache when it gets full; the exceptions are the time-based expiration algorithms.
· FIFO (First In First Out): Items are added to the cache as they are accessed, putting them in a queue or buffer and not
  changing their location in the buffer; when the cache is full, items are ejected in the order they were added. Cache access
  overhead is constant time regardless of the size of the cache. The advantage of this algorithm is that it's simple and fast; it
  can be implemented using just an array and an index. The disadvantage is that it's not very smart; it doesn't make any effort
  to keep more commonly used items in cache. Summary: fast, not adaptive, not scan resistant
· LRU (Least Recently Used): Items are added to the cache as they are accessed; when the cache is full, the least recently
  used item is ejected. This type of cache is typically implemented as a linked list, so that an item in cache, when it is
  accessed again, can be moved back up to the head of the queue; items are ejected from the tail of the queue. Cache access
  overhead is again constant time. This algorithm is simple and fast, and it has a significant advantage over FIFO in being
  able to adapt somewhat to the data access pattern; frequently used items are less likely to be ejected from the cache. The
  main disadvantage is that it can still get filled up with items that are unlikely to be reaccessed soon; in particular, it can
  become useless in the face of scans over a larger number of items than fit in the cache. Nonetheless, this is by far the most
  frequently used caching algorithm; a minimal implementation sketch appears after this list. Summary: fast, adaptive, not
  scan resistant
· LRU2 (Least Recently Used Twice): Items are added to the main cache the second time they are accessed; when the
  cache is full, the item whose second most recent access is furthest in the past is ejected. Because of the need to track the
  two most recent accesses, access overhead increases logarithmically with cache size, which can be a disadvantage. In
  addition, accesses have to be tracked for some items not yet in the cache. There may also be a second, smaller, time limited
  cache to capture temporally clustered accesses, but the optimal size of this cache relative to the main cache depends
  strongly on the data access pattern, so there's some tuning effort involved. The advantage is that it adapts to changing data
  patterns, like LRU, and in addition won't fill up from scanning accesses, since items aren't retained in the main cache
  unless they've been accessed more than once. Summary: not especially fast, adaptive, scan resistant
· 2Q (Two Queues): Items are added to an LRU cache as they are accessed. If accessed again, they are moved to a second,
  larger, LRU cache. Items are typically ejected so as to keep the first cache at about 1/3 the size of the second. This
  algorithm attempts to provide the advantages of LRU2 while keeping cache access overhead constant, rather than having it
  increase with cache size. Published data seems to indicate that it largely succeeds. Summary: fairly fast, adaptive, scan
  resistant
· LFU (Least Frequently Used): Frequency of use data is kept on all items. The most frequently used items are kept in the
  cache. Because of the bookkeeping requirements, cache access overhead increases logarithmically with cache size; in
  addition, data needs to be kept on all items whether or not in the cache. The advantage is that long term usage patterns are
  captured well, incidentally making the algorithm scan resistant as well; the disadvantage, besides the larger access
  overhead, is that the algorithm doesn't adapt quickly to changing usage patterns, and in particular doesn't help with
  temporally clustered accesses. Note: This is sometimes referred to as "perfect LFU", in contrast to "in-cache LFU", which
  retains frequency of use data only on items that are already in the cache and generally does not perform as well. Summary:
  not fast, captures frequency of use, scan resistant
· Simple time-based expiration: Data in the cache is invalidated based on absolute time periods. Items are added to the
  cache and remain in the cache for a specific amount of time. Summary: fast, not adaptive, not scan resistant
· Extended time-based expiration: Data in the cache is invalidated based on relative time periods. Items are added to the
  cache and remain in the cache until they are invalidated at certain points in time, such as every five minutes, each day at
  12:00, etc. Summary: fast, not adaptive, not scan resistant
· Sliding time-based expiration: Data in the cache is invalidated by specifying the amount of time the item is allowed to be
  idle in the cache after the last access time. Summary: fast, adaptive, not scan resistant
· Working set: Based on Dr. Peter Denning's classic "Working Set" paper from ACM Computing Surveys (CSUR), Volume
  2, Issue 3 (September 1970). Data in the cache is marked with a flag on every access. The cache is periodically checked;
  recently accessed members are considered part of the "working set", and members not in the working set are candidates
  for removal. The size of the cache is not defined directly; rather, the frequency of the periodic checks indirectly controls
  how many items are deleted. Summary: fast, adaptive, theoretically near optimal, not scan resistant
· Other algorithms: There are other caching algorithms available that have been tested in published papers. Some of the
  popular ones include CLOCK, GCLOCK, and LRD (Least Reference Density). Of possible interest is IBM's ARC
  (Adaptive Replacement Cache) web page at
  http://www.almaden.ibm.com/StorageSystems/autonomic_storage/ARC/index.shtml - in particular the first link to a PDF
  format paper ("ARC: A Self-Tuning, Low Overhead Replacement Cache"), which includes some useful tables giving
  overhead times and hit ratios as functions of cache size and some other parameters.
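
The LRU sketch promised above, in C. For brevity it uses a linear scan over a tiny fixed-size table and a logical clock; all names and sizes are illustrative. Production caches pair a hash map with a doubly linked list to make both lookup and eviction O(1):

#include <stdio.h>

#define CAP 4

typedef struct { int key, value, last_used, used; } Slot;

static Slot cache[CAP];
static int tick = 0;

/* returns the cached value; 'compute' stands in for the slow data store */
int lru_get(int key, int (*compute)(int)) {
    int lru = 0;
    for (int i = 0; i < CAP; i++) {
        if (cache[i].used && cache[i].key == key) {
            cache[i].last_used = ++tick;         /* refresh recency */
            return cache[i].value;               /* cache hit */
        }
        if (!cache[i].used ||
            cache[i].last_used < cache[lru].last_used)
            lru = i;                             /* remember eviction victim */
    }
    /* cache miss: fetch and evict the least recently used slot */
    cache[lru] = (Slot){key, compute(key), ++tick, 1};
    return cache[lru].value;
}

static int slow_square(int k) { printf("miss: computing %d\n", k); return k * k; }

int main(void) {
    int keys[] = {1, 2, 3, 1, 4, 5, 1};  /* 1 stays hot; 2 is evicted first */
    for (int i = 0; i < 7; i++)
        printf("get(%d) = %d\n", keys[i], lru_get(keys[i], slow_square));
    return 0;
}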

Is there publicly available Java caching code? Yes. Sites offering open source caching code include:

URL                                    Comments            License           Last release
http://swarmcache.sourceforge.net/     Distributed cache   LGPL              October 2003
http://jocache.sourceforge.net/        FIFO, LRU, LFU      LGPL              February 2004
http://jcache.sourceforge.net          Time-based          LGPL              February 2005
http://cache4j.sourceforge.net/        -                   BSD               February 2006
https://whirlycache.dev.java.net/      -                   Apache            May 2006
http://jakarta.apache.org/jcs/         -                   Apache            June 2007
http://www.opensymphony.com/oscache    -                   modified Apache   July 2007
http://ehcache.org/                    -                   Apache            October 2009
http://jboss.org/jbosscache/           -                   LGPL              October 2009
http://code.google.com/p/sccache/      -                   Apache            October 2009


Safe State and its use in deadlock avoidance

When a process requests an available resource, the system must decide whether immediate allocation leaves the system in a safe
state. The system is in a safe state if there exists a safe sequence of all processes. Deadlock avoidance means ensuring that the
system will never enter an unsafe state.
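
For concreteness, a minimal sketch of the safety check used in Banker's-style deadlock avoidance; the process/resource counts and the matrices are illustrative, not from any particular system:

#include <stdio.h>

#define P 3   /* processes */
#define R 2   /* resource types */

int is_safe(int avail[R], int max[P][R], int alloc[P][R]) {
    int work[R], finish[P] = {0};
    for (int j = 0; j < R; j++) work[j] = avail[j];

    for (int done = 0; done < P; ) {
        int progressed = 0;
        for (int i = 0; i < P; i++) {
            if (finish[i]) continue;
            int ok = 1;                    /* can i's remaining need be met? */
            for (int j = 0; j < R; j++)
                if (max[i][j] - alloc[i][j] > work[j]) { ok = 0; break; }
            if (ok) {                      /* run i to completion, reclaim */
                for (int j = 0; j < R; j++) work[j] += alloc[i][j];
                finish[i] = 1; done++; progressed = 1;
            }
        }
        if (!progressed) return 0;         /* no process can finish: unsafe */
    }
    return 1;                              /* a safe sequence exists */
}

int main(void) {
    int avail[R]    = {1, 1};
    int max[P][R]   = {{3, 2}, {2, 2}, {1, 1}};
    int alloc[P][R] = {{1, 1}, {1, 0}, {0, 1}};
    printf("system is %s\n", is_safe(avail, max, alloc) ? "safe" : "unsafe");
    return 0;
}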


hard disk and its purpose

The hard disk is a secondary storage device which holds data in bulk on the magnetic medium of the disk. Hard disks have hard
platters that hold the magnetic medium; the magnetic medium can be easily erased and rewritten.


Fragmentation

Fragmentation occurs in a dynamic memory allocation system when many of the free blocks are too small to satisfy any request.
External Fragmentation happens when a dynamic memory allocation algorithm allocates some memory and a small piece is left
over that cannot be effectively used. If too much external fragmentation occurs, the amount of usable memory is drastically
reduced. Total memory space exists to satisfy a request, but it is not contiguous. Internal fragmentation is the space wasted
inside of allocated memory blocks because of restriction on the allowed sizes of allocated blocks. Allocated memory may be
slightly larger than requested memory; this size difference is memory internal to a partition, but not being used.
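
A tiny sketch of internal fragmentation, assuming a hypothetical allocator that rounds every request up to a 64-byte block (the block size is illustrative):

#include <stdio.h>
#include <stddef.h>

#define BLOCK 64   /* illustrative minimum allocation unit */

size_t rounded(size_t request) {
    return (request + BLOCK - 1) / BLOCK * BLOCK;  /* round up to a block */
}

int main(void) {
    size_t req = 100;
    size_t got = rounded(req);
    printf("requested %zu, allocated %zu, internal fragmentation %zu bytes\n",
           req, got, got - req);   /* 100, 128, 28 */
    return 0;
}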


DRAM. In which form it store data

DRAM is not the best, but it's cheap and does the job. DRAM data resides in a cell made of a capacitor and a transistor. The
capacitor tends to lose its charge unless it is recharged every few milliseconds, and this refreshing tends to slow down the
performance of DRAM compared to speedier RAM types.

Dispatcher

The dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves switching
context, switching to user mode, and jumping to the proper location in the user program to restart that program. Dispatch latency
is the time it takes for the dispatcher to stop one process and start another running.


CPU Scheduler

The CPU scheduler selects from among the processes in memory that are ready to execute and allocates the CPU to one of them.
CPU scheduling decisions may take place when a process: 1. switches from running to waiting state; 2. switches from running to
ready state; 3. switches from waiting to ready; 4. terminates. Scheduling under 1 and 4 is non-preemptive; all other scheduling is
preemptive.


Context Switch

Switching the CPU to another process requires saving the state of the old process and loading the saved state for the new
process. This task is known as a context switch. Context-switch time is pure overhead, because the system does no useful work
while switching. Its speed varies from machine to machine, depending on the memory speed, the number of registers which
must be copied, and the existence of special instructions (such as a single instruction to load or store all registers).


Basic functions of an operating system

The operating system controls and coordinates the use of the hardware among the various application programs for various users.
The operating system acts as resource allocator and manager: since there are many, possibly conflicting, requests for resources,
the operating system must decide which requests are granted resources so as to operate the computer system efficiently and fairly.
The operating system is also a control program that controls user programs to prevent errors and improper use of the computer.
It is especially concerned with the operation and control of I/O devices.


Why paging is used?

Paging is a solution to the external fragmentation problem: it permits the logical address space of a process to be noncontiguous,
thus allowing a process to be allocated physical memory wherever it is available.


What resources used when a thread created? How do they differ from those
when a process is created?

When a thread is created, it does not require many new resources: the thread shares the resources, such as memory, of the
process to which it belongs. The benefit of code sharing is that it allows an application to have several different threads of
activity all within the same address space. New process creation, by contrast, is very heavyweight because it always requires a
new address space to be created; and even if processes share memory, inter-process communication is expensive compared to
communication between threads.

Throughput, Turnaround time, waiting time and Response time

Throughput – number of processes that complete their execution per time unit. Turnaround time – amount of time to execute a
particular process. Waiting time – amount of time a process has been waiting in the ready queue. Response time – amount of
time it takes from when a request was submitted until the first response is produced, not output (for time-sharing environment).
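
A quick worked example, assuming three jobs that all arrive at time 0 with CPU bursts of 24, 3, and 3 time units and run FCFS in that order: they complete at times 24, 27, and 30, so the turnaround times are 24, 27, and 30 (average 27), the waiting times are 0, 24, and 27 (average 17), and the throughput is 3 jobs per 30 time units.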


Thrashing

Thrashing is caused by under-allocation of the minimum number of pages required by a process, forcing it to page-fault
continuously. The system can detect thrashing by evaluating the level of CPU utilization as compared to the level of
multiprogramming. It can be eliminated by reducing the level of multiprogramming.


multi tasking, multi programming, multi threading

Multiprogramming is the technique of running several programs at a time using timesharing. Multiprogramming creates logical
parallelism: the operating system keeps several jobs in memory simultaneously, selects a job from the job pool, and starts
executing it; when that job needs to wait for any I/O operation, the CPU is switched to another job. The main idea is that the
CPU is never idle.
Multitasking is the logical extension of multiprogramming. The concept of multitasking is quite similar to multiprogramming,
but the difference is that the switching between jobs occurs so frequently that the users can interact with each program while it is
running. This concept is also known as time-sharing. A time-shared operating system uses CPU scheduling and
multiprogramming to provide each user with a small portion of the time-shared system.
Multithreading: an application is typically implemented as a separate process with several threads of control. In some situations
a single application may be required to perform several similar tasks: for example, a web server accepts client requests for web
pages, images, sound, and so forth. A busy web server may have many clients concurrently accessing it. If the web server ran as
a traditional single-threaded process, it would be able to service only one client at a time, and the amount of time a client might
have to wait for its request to be serviced could be enormous. So it is efficient to have one process that contains multiple threads
serving the same purpose. This approach multithreads the web-server process: rather than creating another process when a
request is made, the server creates a separate thread to service the request. Multithreading is used to gain responsiveness,
resource sharing, economy, and utilization of multiprocessor architectures.


virtual memory

Virtual memory gives an application program the impression that it has contiguous working memory (an address space), while
in fact it may be physically fragmented and may even overflow onto disk storage. Systems that use this technique make
programming of large applications easier and use real physical memory (e.g. RAM) more efficiently than those without virtual
memory. Note that "virtual memory" is more than just "using disk space to extend physical memory size" - that is merely the
extension of the memory hierarchy to include hard disk drives. Extending memory to disk is a normal consequence of using
virtual memory techniques, but could be done by other means such as overlays or swapping programs and their data completely
out to disk while they are inactive. The definition of "virtual memory" is based on redefining the address space with contiguous
virtual memory addresses to "trick" programs into thinking they are using large blocks of contiguous addresses.

Almost all implementations of virtual memory divide the virtual address space of an application program into pages; a page is a
block of contiguous virtual memory addresses. Pages are usually at least 4K bytes in size, and systems with large virtual
address ranges or large amounts of real memory (e.g. RAM) generally use larger page sizes.
Almost all implementations use page tables to translate the virtual addresses seen by the application program into physical
addresses (also referred to as "real addresses") used by the hardware to process instructions. Each entry in the page table
contains a mapping for a virtual page to either the real memory address at which the page is stored, or an indicator that the page
is currently held in a disk file. (Although most do, some systems may not support use of a disk file for virtual memory.)

If, while executing an instruction, a CPU fetches an instruction located at a particular virtual address, fetches data from a
specific virtual address or stores data to a particular virtual address, the virtual address must be translated to the corresponding
physical address. This is done by a hardware component, sometimes called a memory management unit, which looks up the real
address (from the page table) corresponding to a virtual address and passes the real address to the parts of the CPU which
execute instructions. If the page tables indicate that the virtual memory page is not currently in real memory, the hardware
raises a page fault exception (a special internal signal) which invokes the paging supervisor component of the operating system.
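
A minimal sketch of the translation step just described, using a single-level page table; the 4 KB page size and the tiny table contents are illustrative:

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define PAGE_SHIFT  12          /* log2(4096) */
#define NPAGES      8

/* frame number per virtual page; -1 marks a page that is not resident */
static int page_table[NPAGES] = {3, 7, -1, 0, 5, -1, 2, 1};

int translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn    = vaddr >> PAGE_SHIFT;        /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);    /* offset within page */
    if (vpn >= NPAGES || page_table[vpn] < 0)
        return -1;                                /* page fault */
    *paddr = ((uint32_t)page_table[vpn] << PAGE_SHIFT) | offset;
    return 0;
}

int main(void) {
    uint32_t p;
    if (translate(0x1234, &p) == 0)    /* vpn 1, offset 0x234 */
        printf("0x1234 -> 0x%x\n", p); /* frame 7 -> 0x7234 */
    if (translate(0x2000, &p) != 0)    /* vpn 2 is not resident */
        printf("0x2000 -> page fault\n");
    return 0;
}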

Embedded systems and other special-purpose computer systems which require very fast and/or very consistent response times
may choose not to use virtual memory due to decreased determinism.


A program has a File Input phase and a Computing phase; Input turns out to take a long time. How can this be improved?

Answer:
· I answered: use multi-threading to overlap input and computation.
· He asked: what if the application is already multi-threaded, and all input must be ready before computation can start?
· I answered: check whether anything is read repeatedly and keep it in memory, e.g., a file index.
· Any more? Distributed storage, like Google's GFS.
· Any more? Use the disk cache.
· Any more? Check whether the input is laid out non-contiguously, which increases disk seek time; try to read adjacent records
in a single pass.
· Any more? Check whether the file itself is fragmented on disk, and defragment it.
· Any more? Check whether other programs are competing for the disk; for example, on Windows XP, copying two folders at
the same time takes far longer than the sum of copying them one after another.


compare two objects

I said implement equals(); he then asked what other function must be considered: hashCode().


Multiprocessing, multi-core processor, Parallel computing, concurrency and
Distributed computing

Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system. The term also refers
to the ability of a system to support more than one processor and/or the ability to allocate tasks between them. There are many
variations on this basic theme, and the definition of multiprocessing can vary with context, mostly as a function of how CPUs are
defined (multiple cores on one die, multiple chips in one package, multiple packages in one system unit, etc.). Multiprocessing
sometimes refers to the execution of multiple concurrent software processes in a system as opposed to a single process at any one
instant. However, the terms multitasking or multiprogramming are more appropriate to describe this concept, which is
implemented mostly in software, whereas multiprocessing is more appropriate to describe the use of multiple hardware CPUs. A
system can be both multiprocessing and multiprogramming, only one of the two, or neither of the two.

A multi-core processor is a processing system composed of two or more independent cores. The cores are typically integrated
onto a single integrated circuit die (known as a chip multiprocessor or CMP), or they may be integrated onto multiple dies in a
single chip package. A many-core processor is one in which the number of cores is large enough that traditional multi-processor
techniques are no longer efficient — this threshold is somewhere in the range of several tens of cores — and likely requires a
network on chip.

Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle
that large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel"). There are several
different forms of parallel computing: bit-level, instruction level, data, and task parallelism. Parallelism has been employed
mainly in high-performance computing, but interest in it has grown lately due to the physical constraints preventing frequency
scaling. As power consumption (consequently heat generation) by computers has become a concern in recent years, parallel
computing has become the dominant paradigm in computer architecture, mainly in the form of multicore processors.

concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with
each other. The computations may be executing on multiple cores in the same chip, preemptively time-shared threads on the same
processor, or executed on physically separated processors. A number of mathematical models have been developed for general
concurrent computation, including Petri nets, process calculi, the synchronous model, and the Actor model.

Distributed computing: There are several autonomous computational entities, each of which has its own local memory; The
entities communicate with each other by message passing.


embedded system related concepts

An embedded system is a computer system designed to perform one or a few dedicated functions, often with real-time computing
constraints. It is embedded as part of a complete device often including hardware and mechanical parts. In contrast, a
general-purpose computer, such as a personal computer, is designed to be flexible and to meet a wide range of end-user needs.

Firmware is a term sometimes used to denote the fixed, usually rather small, programs that internally control various electronic
devices. Typical examples range from end-user products such as remote controls or calculators, through computer parts and
devices like hard disks, keyboards, TFT screens or memory cards, all the way to scientific instrumentation and industrial robotics.
Also more complex consumer devices, such as mobile phones, digital cameras, synthesizers, etc., contain firmware to enable the
device's basic operation as well as implementing higher level functions. There are no strict boundaries between firmware and
software. However, firmware is typically involved with very basic low-level operations in a device, without which the device
would be non-functional.

Flash memory is a non-volatile computer memory that can be electrically erased and reprogrammed. It is primarily used in
memory cards and USB flash drives for general storage and transfer of data between computers and other digital products.

System-on-a-chip or system on chip (SoC or SOC) refers to integrating all components of a computer or other electronic system
into a single integrated circuit (chip). It may contain digital, analog, mixed-signal, and often radio-frequency functions – all on
one chip. A typical application is in the area of embedded systems. The contrast with a microcontroller is one of degree.
Microcontrollers typically have under 100K of RAM (often just a few KBytes) and often really are single-chip-systems; whereas
SoC is typically used with more powerful processors, capable of running software such as Windows or Linux, which need
external memory chips (flash, RAM) to be useful, and which are used with various external peripherals. In short, for larger
systems, SoC indicates a technical direction more than reality: increasing chip integration to reduce manufacturing costs and to
enable smaller systems. Many interesting systems are too complex to fit on just one chip built with a process optimized for
just one of the system's tasks. When it is not feasible to construct an SoC for a particular application, an alternative is a system in
package (SiP) comprising a number of chips in a single package. In large volumes, SoC is believed to be more cost effective than
SiP since it increases the yield of the fabrication and because its packaging is simpler.

A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by customer or designer after
manufacturing—hence "field-programmable". The FPGA configuration is generally specified using a hardware description
language (HDL), similar to that used for an application-specific integrated circuit (ASIC) (circuit diagrams were previously used
to specify the configuration, as they were for ASICs, but this is increasingly rare). FPGAs can be used to implement any logical
function that an ASIC could perform. The ability to update the functionality after shipping, and the low non-recurring engineering
costs relative to an ASIC design (notwithstanding the generally higher unit cost), offer advantages for many applications. FPGAs
contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the
blocks to be "wired together"—somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform
complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of memory.

An application-specific integrated circuit (ASIC) is an integrated circuit (IC) customized for a particular use, rather than
intended for general-purpose use. For example, a chip designed solely to run a cell phone is an ASIC. Intermediate between
ASICs and industry standard integrated circuits, like 7400 or 4000 series, are application specific standard products (ASSPs).

A Real-Time Operating System (RTOS) is a multitasking operating system intended for real-time applications. Such
applications include embedded systems, industrial robots, spacecraft, industrial control, and scientific research equipment. A
RTOS facilitates the creation of a real-time system, but does not guarantee the final result will be real-time; this requires correct
development of the software. An RTOS does not necessarily have high throughput; rather, an RTOS provides facilities which, if
used properly, guarantee deadlines can be met generally or deterministically (known as soft or hard real-time, respectively). An
RTOS will typically use specialized scheduling algorithms in order to provide the real-time developer with the tools necessary to
produce deterministic behavior in the final system. An RTOS is valued more for how quickly and/or predictably it can respond to
a particular event than for the amount of work it can perform over a given period of time. Key factors in an RTOS are therefore a
minimal interrupt latency and a minimal thread switching latency.

A cyber-physical system (CPS) is a system featuring a tight combination of, and coordination between, the system’s
computational and physical elements. Today, a pre-cursor generation of cyber-physical systems can be found in areas as diverse
as aerospace, automotive, chemical processes, civil infrastructure, energy, healthcare, manufacturing, transportation,
entertainment, and consumer appliances. This generation is often referred to as embedded systems. In embedded systems the
emphasis tends to be more on the computational elements, and less on an intense link between the computational and physical
elements. Unlike more traditional embedded systems, a full-fledged CPS is typically designed as a network of interacting
elements instead of as standalone devices. The expectation is that in the coming years ongoing advances in science and
engineering will improve the link between computational and physical elements, dramatically increasing the adaptability,
autonomy, efficiency, functionality, reliability, safety, and usability of cyber-physical systems. The advances will broaden the
potential of cyber-physical systems in several dimensions, including: intervention (e.g., collision avoidance); precision (e.g.,
robotic surgery and nano-level manufacturing); operation in dangerous or inaccessible environments (e.g., search and rescue,
firefighting, and deep-sea exploration); coordination (e.g., air traffic control, war fighting); efficiency (e.g., zero-net energy
buildings); and augmentation of human capabilities (e.g., healthcare monitoring and delivery).


synchronization and asynchronization

Synchronization refers to one of two distinct but related concepts: process synchronization and data synchronization. Process
synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, so as to reach an
agreement or commit to a certain sequence of actions. Data synchronization refers to the idea of keeping multiple copies of a
dataset coherent with one another, or of maintaining data integrity. Process synchronization primitives are commonly used to
implement data synchronization.

In a synchronous system, operations are coordinated under the centralized control of a fixed-rate clock signal or several clocks.
An asynchronous digital system, in contrast, has no global clock: instead, it operates under distributed control, with concurrent
hardware components communicating and synchronizing on channels.
Memory organization and layout

Are objects created by calling new placed on the heap? I remembered that the heap stores static and global variables, and that
only uninitialized static and global variables go in the data segment; but the interviewer said both global and static variables
live in the data segment.

Answer: http://www.mitbbs.com/article_t/JobHunting/31510787.html (a phone-interview question thread)

global is static.
static --> data segment; the address is determined at compile time; initialized to 0.
local variable --> stack.
malloc/new --> heap; new calls malloc.

The "data segment" refers to a location in memory; as a segment, it is a concept only on x86 and other machines that support
segmentation. In the file, initialized static/global variables live in a section called .data (assuming ELF). After the program is
loaded into memory, both .data and .bss normally sit in the data segment; so does the heap, which comes after them.

Taking Linux with gcc as an example, the address map of a process is as follows:
(starting from 0x00000000 to 0xbfffffff)
1. text section: program code (read only)
2. data section: initialized global variables, and static vars (in function)
3. bss section: uninitialized global variables
4. heap section: dynamically allocated memory by malloc (new)
5. shared library or memory mapped files
6. stack: function activation frames and local variables

   有 些 关 要 section, 比
还 一 无 紧 的                   如    rodata                       来 放
                                                             用 存               string           literals。 但 到 上 个 要 sections
                                                                                                           直 以 几 主 的
   面 放 什 就 不 了 以 一 程 测 :
里 都 了 么 差 多 。可 写 下 序 试
#include <stdio.h>
#include <stdlib.h>

int a = 1;        /* initialized global: .data */
int b;            /* uninitialized global: .bss */
void foo() { /* ... */ }

int main() {
    int c = 3;                 /* local variable: stack */
    int *d = malloc(5);        /* dynamically allocated: heap */
    printf("&a=%p &b=%p &c=%p &foo=%p d=%p\n",
           (void*)&a, (void*)&b, (void*)&c, (void*)&foo, (void*)d);
    return 0;
}

bss is from the historical name 'Block Started by Symbol'. Since uninitialized variables are automatically set to zero, there is no
need to store the zeros in the program file; when the program is loaded into memory, a zero-filled page is mapped, which saves
disk space. .bss is the part of the data section that holds uninitialized static/global variables. Its benefit is that, after the program
is compiled, no space has to be allocated for these variables in the object file; they are initialized to 0 only when the program is
loaded for execution.



Network

TCP vs. UDP

TCP is a reliable protocol; UDP is not. TCP provides ordered, acknowledged delivery of a byte stream and retransmits lost
segments; UDP is a connectionless datagram service with no delivery, ordering, or duplicate-suppression guarantees.


Why use UDP

For real-time applications (e.g. streaming media, VoIP, games), where a late retransmission is useless and the lower per-packet
overhead of UDP matters more than reliability.


Procedure of the TCP handshake; format of a TCP header

a) The server is listening at a certain port, say 21. The client sends a SYN packet to port 21.
b) After the server receives the packet, it replies with a SYN/ACK packet.
c) The client then sends an ACK packet back to the server; the connection is established and data can flow in both directions.
d) The TCP header contains source port, destination port, sequence number, ACK number, data offset, reserved bits, flags,
window, checksum, urgent pointer, and options.
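
A minimal sketch, assuming POSIX sockets (the address and port here are only examples): the connect() call is what performs the three-way handshake described above.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* TCP socket */
    struct sockaddr_in srv;
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_port   = htons(21);                 /* example: FTP control port */
    inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);

    /* kernel sends SYN, waits for SYN/ACK, replies with ACK */
    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) == 0)
        printf("handshake complete, connection established\n");
    else
        perror("connect");
    close(fd);
    return 0;
}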



Linux/unix

difference between find and grep

http://www.geekinterview.com/question_details/49964 : 'grep' and 'find' are both used for matching patterns, but their uses
differ. The 'find' command is used when we need to locate a file or directory name within a directory tree, whereas the 'grep'
command is used to match text appearing inside a file.

Grep usually takes its input from another command (usually the content of files or folders, etc.), whereas find works without
taking input from another command. Find is more powerful than grep, with many options such as:
-access time
-modification time
-user name
-terminal name
-folder
-file
-character file
etc..
Also, with the find command we can execute actions on the matches (like removing the found files, etc.).
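
For example (the paths and patterns below are purely illustrative):
find /var/log -name '*.log' -mtime -1          # log files modified within the last day
grep -n error /var/log/messages                # lines containing 'error', with line numbers
find . -name '*.c' -exec grep -l main {} +     # C files under the current tree that mention main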


Question 1

In Unix, given data in 3 columns, how do you count the number of distinct elements in the second column?
Answer: http://www.mitbbs.com/article_t/JobHunting/31527371.html (asking about a Unix command)
if your data is in t.txt, awk '{print $2}' t.txt | sort -u | wc -l



Unsolved questions
1. Given a stack and 5 variables, you are required to put the 5 variables into the stack. How would you test it before putting them in?
2. After the source file is compiled to a binary file, how is the binary file organized and stored?
3. Memory leaks: have you ever met a program with a memory leak? Did it cause a severe problem? How did you find the leak?
Have you ever used any software to detect memory leaks?
4. When inserting a record into a table, how can you make the related tables change accordingly?
5. Advantage/disadvantage of void pointer in C.
6. key components in stl
7. how is free(void*) implemented?
8. How to kill a process; how to kill all child processes of a process (kill -1?); what if that process cannot be killed?
9. How to implement multiple inheritance in Java? Why does Java use interfaces while C++ keeps the feature of multiple inheritance?
10. Can regular expressions resolve the problem of nested structures?
11. If I wrote a library in C++, how to handle it such that I can use it in C code?
    My answer was to use extern "C" , then he asked what happens if I don't use it?
12. Collections / how to cast data types.
13. tell me about sliding window
14. tell me about the algorithm of Transport Layer
15. lock and deadlock, give the example of them; and how to avoid or deal with deadlock.
16. what is router? what is switch? the difference.
17. about VPN. tell me about the architecture of TCP/IP protocol which used in client/server situation
18. compare: 64M bit file, if use FTP or (another protocol, forget), which one is faster.
19. how much do you know Layer 3.
20. how much do you know Layer 2.
21. Discuss html vs. xhtml vs. xml.
22. Describe what happens after typing a URL into a browser: DNS, cache, etc.
23. vector vs. arraylist, growth strategy & complexity
24. In a C++ file, class A is only declared but never defined in any way; what is that used for?
25. how do you test a calculator.
26. The difference between an array and a Vector in Java.
27. How do you implement (resolve?) a one-to-many relationship in a database?
28. Explain the heap corruption
29. Explain the differences between IUnknown & IDispatch
30. describe shortest path algorithm hint: any one at http://en.wikipedia.org/wiki/Shortest_path
31. synchronize threads with critical section Win32 API
32. use Interlocked* Win32 API to implement critical section
33. questions about windows security descriptor
34. design and coding about decyphering character-swapped text
35. What is the purpose of the preprocessor directive #error?
36. Embedded systems often need infinite loops; how would you write an endless loop in C?
37. What does the keyword volatile mean? Give three different examples.
38. What is dynamic memory management? What is the parameter of free(), and why doesn't the parameter specify the size of
the space to be freed?
39. Can a template class inherit from a non-template class? Why?
40. Can a template function have default arguments? Why?
41. Under what circumstances does pass by pointer have an advantage over pass by reference?
42. Is the order of throwing exceptions significant? Why? What happens if an exception is not caught?
43. How do threads communicate with each other? What are critical section/semaphore/mutex, and what are the differences?
44. The concepts of heap and stack, and their differences?
45. The deadline is near and the program has a serious problem: either there is no time to fix it, or the programmer refuses to
fix it. What do you do? What if there is no time to test even the items on the priority list?
46. The concept of Boost pool memory.
47. The concept of a Java monitor.
48. What is deadlock, and how do you avoid it?
49. In a single processor system, what is the simplest way to prevent multiple threads(processes) from racing.
50. describe process virtual space;
51. list some fields in process table;
52. what's a device driver;
53. The difference between a system call and a function call.
54. Security: what are public key encryption and secret key encryption? Name a representative algorithm for each.
55. Whenever you open a file, describe all the steps that the OS goes through: directory list, inum, inode, data sector.
56. what is Volatile? In what two cases will you use it?
57. what is critical section?
58. How to serialize and deserialize 1) a binary tree and 2) a binary graph?
59. What are stack, stash and queue?
60. what are map, set, multimap, and multiset?
61. What is the advantage to use map instead of list?
62. What is storage draft?
63. Given a file, tell me the steps of reading the file.
64. In what case is O(n^2) better than O(n)?
65. What is enum's limitation?
66. What is pthread_mutex?
67. What is the difference between pass by pointer and pass by reference
68. What are BTrees and B+ trees?
69. the gmail page loads very slow. any suggestion for improvement?
http://www.mitbbs.com/article_t/JobHunting/31487921.html (posting the questions from my first-round Google phone screen)
70. We want to check the number of queries received from around the world in the last minute and in the last hour; what data
structure should you use for that? If there are billions of records, i.e., too many records for main memory, what suggestions do
you have? http://www.mitbbs.com/article_t/JobHunting/31487921.html (posting the questions from my first-round Google phone screen)
71. Find all phone numbers in the html pages in a folder (and subfolders).
http://www.mitbbs.com/article_t/JobHunting/31493961.html (just had my Amazon 2nd phone interview)
72. Imagine you have data being pulled very frequently from a large database. How would you design a MRU (most recently
used) cache? Implement getValue(int id) for the MRU cache. http://www.mitbbs.com/article_t/JobHunting/31501147.html
"Goog questions"
73. How would you write a benchmark to calculate the time of each memory access?
http://www.mitbbs.com/article_t/JobHunting/31501147.html "Goog questions"
