General Technology for AI.
A completely different kind of computer technology.
This article will give you an overview of the entire technology space in AI.
General buzzword reference. There is some structure and organization to the terms thrown about.
Here are some industry products from Big Tech using AI.
What to expect from this article.
A computer is a state machine with switches turned on and off. In imperative programming the logic follows a flow chart. With neural networks, the output is computed from weights and biases.
When you interact with a computer program you expect a response, whether it is an AI, a website, a mainframe or a workstation. A lot goes on behind the scenes for this to happen.
This article gets into the process, software, hardware, theory and methods, and asks whether there is structure, organization, layers, hierarchy and interaction among them.
Take a quick look now and refer back for more details in the future.
Programming and Theory.
Older Programming Techniques.
Old technology was about information going through a workflow. Information was changed and processed through a series of steps. Assignment operations. Decisions with if/else. Loops with for and while.
The difference with AI is that the entire information set is sent to a model, which gives an output based on weights, biases and layers.
So the AI style is generally not used for exact, complex calculations, but it can be used for tasks such as checking whether certain values are fraudulent aberrations.
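The contrast above can be sketched in a few lines. This is a minimal illustration, not a real fraud system: the thresholds, the two input features, and the values of `WEIGHTS` and `BIAS` are hypothetical and chosen by hand, whereas a real model would learn them from data.

```python
import math

def rule_based_check(amount, country_mismatch):
    """Imperative style: explicit assignment, if/else, hand-written steps."""
    flagged = False                          # assignment
    if amount > 10_000:                      # decision
        flagged = True
    elif country_mismatch and amount > 1_000:
        flagged = True
    return flagged

def model_score(features, weights, bias):
    """Model style: weighted sum plus bias, squashed to a score between 0 and 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters; training would normally find these.
WEIGHTS = [0.0004, 2.0]   # one weight per feature: amount, country_mismatch
BIAS = -3.0

print(rule_based_check(12_000, False))
print(model_score([12_000, 0.0], WEIGHTS, BIAS))
```

The rule gives a yes or no answer that a programmer wrote down. The model gives a score whose behavior depends entirely on the weights and bias.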
Types of AI.
Machine Learning. Deep Learning. Neural Network.
Cognitive Computing. Simulate human thought. Expert Systems. Rules and facts to make decisions. Fuzzy Logic. Degrees of truth to model reasoning.
Robotics. Movements, functions automatically with speed and precision. Computer Vision. Meaningful info from digital images and videos. Natural Language Processing. Understand regular text and spoken word.
Elements of Intelligence.
Reasoning. Learning. Problem Solving. Perception. Linguistic Intelligence.
General Overview of how AI is different and fits in with Computing.
Process.
AI Technology.
Different types. AI. A machine that can learn, reason, generalize and infer meaning. Neural Nets. Interconnected nodes in layered structure. Machine Learning. Using data for prediction rather than explicit programming. Deep Learning. Multiple layers of neural networks for higher level features.
Process Steps.
Choosing Data. Cleaning data. Prepare data. Choosing Model. Tuning model. Training. Prepare model. Choosing Tests. Checking predictions.
Hardware Platform. Choosing the Chips. Computer server architecture. Or Cloud. Choosing the OS. Choosing Dev Ops. Choosing the Use Cases and applications. Output.
Software Engineering. Stack and platform. IDE. programming language. support. Architecture. Design. Process. Tools. GUI Interface. Data storage. Testing. Deployment. Software as a service. Availability.
Terms for Training.
Confusion Matrix. For testing. Table of actual vs predicted classes. Convergence. Loss settles within a small error range of its final value. Metrics. Accuracy, Loss, Confusion Matrix, AUC, MAE, RMSE, R Squared.
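Several of the metrics named above are short formulas. The sketch below assumes binary labels coded as 0 and 1 for the confusion matrix and plain lists of numbers for the regression metrics. The function names are made up for this illustration.

```python
import math

def confusion_matrix(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels coded as 0 and 1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(actual, predicted):
    tp, fp, fn, tn = confusion_matrix(actual, predicted)
    return (tp + tn) / len(actual)

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

print(confusion_matrix([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```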
Model Drift and Decay. New unseen data and changed assumptions degrade results. Model Training. Finding the best combination of weights and bias over the prediction range. Overfitting. Inaccurate predictions for new data even though training results are fine. Underfitting. Relation between input and output not captured in training. Reproducibility. Similar results for same dataset in same environment. Transfer Learning. Reuse of trained model on a new problem.
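Data Sets. Information for training. Epochs. Full passes over the training data. Batch size. Number of samples processed before a weight update. Iterations. Number of batches needed to complete one epoch. Synthetic data. Artificially generated using algorithms. Structured data. Standard format, order, accessible, structure for model. Unstructured data. Large collections without a predefined data model, though items may have internal structure.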
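How epochs, batch size and iterations relate can be shown with a toy loop. This is a sketch with a made-up dataset of 10 items. No real training happens; the loop only counts how many batch updates would occur.

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

def batches(data, batch_size):
    """Yield consecutive slices of the data; the last one may be smaller."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = list(range(10))   # stand-in for 10 training samples
epochs = 3
batch_size = 4

total_updates = 0
for epoch in range(epochs):
    for batch in batches(data, batch_size):
        total_updates += 1   # one iteration = one batch processed

print(iterations_per_epoch(len(data), batch_size), total_updates)
```

With 10 samples and a batch size of 4, each epoch takes 3 iterations, so 3 epochs give 9 updates.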
More Training. Terms from Differentiable Computing on Wiki.
Tensor Calculus. Extension of vector calculus to tensor fields. Computational learning theory. Design and analysis of machine learning algorithms, including supervised learning. Inductive bias. Assumptions a learner uses to predict outputs for inputs it has not seen. Diffusion. Continuous time random processes with continuous sample paths.
Tokens. Smaller units such as words, characters or sub words that a piece of text is separated into. Tokens are mapped to vectors, with the position in the token sequence also vectorized. Tokenizer. Splits the text into tokens. A form of compression like dictionary coding.
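A toy tokenizer makes the idea concrete. This sketch splits on whitespace only, which is an assumption for simplicity. Production tokenizers usually work on sub words. The functions `tokenize`, `build_vocab` and `encode` are hypothetical names for this illustration.

```python
def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple scheme)."""
    return text.lower().split()

def build_vocab(tokens):
    """Assign each distinct token an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Return (token_id, position) pairs; both would later be turned into vectors."""
    return [(vocab[tok], pos) for pos, tok in enumerate(tokenize(text))]

text = "the cat saw the dog"
vocab = build_vocab(tokenize(text))
encoded = encode(text, vocab)
print(vocab)
print(encoded)
```

Both occurrences of "the" map to the same id, which is the dictionary coding aspect. The position is kept separately so that word order is not lost.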
General concepts of Process, Software, Hardware, Model, Training.
Using Information.
Training Concepts.
Activation. Output of the node based on input. Sigmoid. Rectifier (ReLU). Softmax.
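The three activation functions named above are each a line or two of arithmetic. This is a plain Python sketch; frameworks apply the same formulas to whole tensors at once.

```python
import math

def sigmoid(x):
    """Squash any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectifier: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)

def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)                               # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0), relu(-2.0), softmax([1.0, 2.0, 3.0]))
```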
Gradient descent. SGD. Iterative algorithm for optimization. Optimization here means finding a local minimum of the loss function. Loss functions. Price paid for inaccuracy of predictions, for example in classification problems. Backpropagation. Propagates the error backward from output nodes to input nodes to compute the gradient for each weight.
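Gradient descent can be shown on the smallest possible model, a single weight `w` in `y = w * x` with mean squared error as the loss. The data, learning rate and step count below are arbitrary choices for this sketch.

```python
def loss(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, xs, ys):
    """Derivative of the loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, learning_rate=0.05, steps=200):
    w = 0.0                                   # arbitrary starting point
    for _ in range(steps):
        w -= learning_rate * gradient(w, xs, ys)   # step against the gradient
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]                     # generated by y = 2 * x
w = gradient_descent(xs, ys)
print(w, loss(w, xs, ys))
```

Each step moves `w` a little in the direction that lowers the loss, so `w` approaches 2. Stochastic gradient descent does the same thing but estimates the gradient from a random batch instead of the full dataset.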
Attention. Focusing on some part of the input data. Normalization. Common data scale without shape distortion. Regularization. Constrains the model toward simpler answers to reduce overfitting. Augmentation. Modified copies of existing data added to the training set. Used to reduce overfitting.
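Normalization, as described above, can be sketched with two common schemes. Both functions assume the input values are not all identical, since that would cause a division by zero.

```python
import math

def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Shift to mean 0 and scale to standard deviation 1 (population std)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

print(min_max_scale([10.0, 20.0, 30.0]))
print(z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

Both transforms are linear, so the relative spacing of the values is preserved. Only the scale changes.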
Diffusion. Random process with continuous time and continuous paths. Autoregression. Model in which each value depends on its own previous values. Used to describe time varying processes in nature.
Datasets. Collection of information for training. Clustering. Grouping of similar objects in a group. Regression Analysis. Relationship between independent variables and dependent variables. Convolution. How the shape of one function is modified by another function.
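For discrete data, convolution means sliding a small kernel over a signal and taking a weighted sum at each position. The sketch below computes only the positions where the kernel fully overlaps the signal. The name `convolve_1d` is made up for this illustration.

```python
def convolve_1d(signal, kernel):
    """Discrete convolution over fully overlapping positions only.

    The kernel is flipped, as in the mathematical definition, then slid
    across the signal.
    """
    flipped = kernel[::-1]
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * flipped[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]

# A difference kernel responds to change in the signal.
print(convolve_1d([1, 2, 3, 4], [1, 0, -1]))
# An averaging kernel smooths the signal.
print(convolve_1d([1, 2, 3, 4, 5], [1 / 3, 1 / 3, 1 / 3]))
```

Convolution layers in deep learning frameworks typically skip the flip, which is technically cross correlation. Since the kernel values are learned, the distinction does not matter in practice.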
Adversarial machine learning. Attacks on and defenses for machine learning algorithms. Hallucination. Confident AI response not grounded in the training data.
More Concepts in Training the Model using Data, Cycles, Results.
Development Process.
Software Engineering for AI.
Frameworks: TensorFlow. PyTorch. Keras. Theano. JAX. Fast.ai. Microsoft Cognitive Toolkit. MXNet. Gluon. Chainer. PaddlePaddle. DL4J. Caffe. Languages: Python. Julia. Swift. Libraries: NumPy. SciPy. Pandas. Matplotlib.
TensorBoard. Measurements, visualizations for training workflow, progress, performance.
Hardware Specialized for AI.
GPU. Graphics chips are generally used rather than the regular CPU. Also ASICs and FPGAs. Edge devices like smartphones. Memory 512 GB. Embedded high bandwidth memory. TPU. Tensor Processing Unit. ASIC from Google that accelerates linear algebra computation. Storage. Local NVM drives. Network. 10 Gbps or higher Ethernet. CPU. For the VM Container.
Memristor. Memory resistor is a two terminal electrical component relating charge to magnetic flux linkage. SpiNNaker. Spiking Neural Network Architecture, a massively parallel manycore supercomputer architecture.
Different Models for Neural Networks.
Model Zoo. Model Catalog. Huge number of Models to choose from.
Dev Ops for ML and Automation of development.
CI/CD for ML. Continuous integration, delivery, deployment to deliver apps frequently. Containers. Software package with necessary elements to virtualize OS and run in any environment. Jupyter Notebooks. Browser tool for live code, visualizations, text. Kubernetes. Operational tasks, container management, deployment and monitoring. Different steps in ML Ops. Data gathering, analysis, preparation and model training, validation, serving and monitoring. Model Deployment. Last stage where model is placed into production environment. Serverless ML. Loosely coupled serverless services that provide compute and storage for AI system. Services run pipelines.
Development Environment: Software, Hardware, Models, Dev Ops.
Model.
Parameters of Model.
Weights. Multiplication factor. How much input is passed. Bias. Constant addition value. Activation function. Will it fire. Step Function. ReLU. SoftMax. Action Potential. Refractory Period. Waiting period.
Which parameters are tuned. Objective function. Gradient descent. Gradient Boosting. Vanishing gradient problem. Training problem where gradients diminish to very low values.
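The diminishing gradient values mentioned above follow from simple arithmetic. The derivative of the sigmoid is at most 0.25, and backpropagation multiplies one such factor per layer. The sketch below assumes every weight is 1 and every pre-activation is the same value, which is a simplification made only to isolate the effect.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """Slope of the sigmoid; its maximum is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def gradient_scale(num_layers, pre_activation=0.0):
    """Product of sigmoid derivatives across layers (all weights assumed to be 1)."""
    return sigmoid_derivative(pre_activation) ** num_layers

for layers in (1, 2, 5, 10):
    print(layers, gradient_scale(layers))
```

Even in this best case, ten layers shrink the gradient to less than one millionth of its original size, so the early layers barely learn. This is one reason ReLU is popular: its derivative is 1 for positive inputs.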
Hyper parameters for Training.
Number of Epochs. Batch size. Pooling size. Number of Hidden Layers. Number of neurons per layer. Activation functions. Loss function. Learning rate. Regularization. Optimizer. Number of Clusters for clustering. Filter size for CNN.
Paradigms for Neural Networks.
Supervised learning. Data is labeled with the expected outputs. Unsupervised learning. Untagged data. Mimicry. Reinforcement Learning. No labeled data. Rewards for behavior.
Online Learning. Sequential data rather than batch data. Batch Learning. Trains on all data at once, on a sample, or in mini batches.
Semi Supervised Learning. Small amount of labeled data. Large amount of unlabeled data. Self Supervised Learning. Unlabeled data. Labels are generated from the data itself.
Back Propagation. Fine tuning weights based on error rates. Vanishing gradient problem. Gradients shrink as they pass back through many layers. Feed Forward Network. Connections do not form a loop.
Problems solved by Neural Networks.
Classification. Generative Model. Regression. Clustering. Basically separation into groups.
Density estimation. Probability density function of the population. Anomaly detection. Data that does not meet pattern. Dimension Reduction. Compression of data.
Association Rules. Relations between variables. Structured Predictions. Structured objects and not scalar or tensor. Learning to Rank. Sort and order of results.
Semantic Analysis. Building structures of concepts for understanding. Grammar Induction. Rules based on observations. Ontology learning. Relation of terms and concepts.
Auto ML. Data Cleaning. Feature engineering. Feature learning. Features are the set of information used to train the model for the output of interest. Multimodal learning. Combining text, image or audio.
Problems Encountered by Neural Networks.
CPU time and energy. Amount of hardware needed. Choosing the right model and number of layers.
Material needed for Information and training. People needed for cleaning up, labels, classification or reinforcement.
Hallucination. Finding things that are not there. Bias. Ethics. Morality.
Generalizable predictions. Overall rules and philosophy based on data. Interpretable/Explainable. Rigorous explanation of solution. Physics engines. Plugging in modules and data with custom rules.
Kinds of Architectures.
General AI. Specialized AI. Edge Computing using IoT with AI. Distributed or Parallel AI. Swarm Intelligence with AI. Layers and N-Tier of AI. Cloud with AI server. Control Layer in between neural networks such as physics engine.
Model Training, Tuning. Learning. Problems Solved, Created. Architecture.
Architectures.
Supervised Learning.
One type of Machine Learning.
Apprenticeship Learning. By observing an expert.
Decision Trees. Tree model for both classification and regression.
Ensembles. Bagging. Boosting. Random Forest. Multiple learning algorithms with flexible alternative models.
k-NN. k-nearest neighbors for classification and regression. k is a hyperparameter. Naive Bayes. For classification problems. Perceptron. Single layer neural network.
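k-NN is simple enough to write out in full. This sketch handles classification only and uses Euclidean distance with a majority vote. The training points and labels are invented for the example.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points.

    train is a list of (point, label) pairs; points are tuples of numbers.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    top_labels = [label for _, label in by_distance[:k]]
    return Counter(top_labels).most_common(1)[0][0]

train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_classify(train, (0.5, 0.5)))
print(knn_classify(train, (5.5, 5.5)))
```

There is no training step. The model is just the stored data, and `k` is the hyperparameter that controls how many neighbors get a vote.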
Linear Regression. Models a linear relationship between inputs and output. Logistic Regression. Models the probability of a class by applying the logistic function to a linear combination of variables.
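For one input variable, linear regression has a closed form solution by ordinary least squares. The sketch below uses it directly. The sample points are invented and lie exactly on a line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)
```

Logistic regression uses the same kind of linear combination but passes it through the sigmoid to get a probability, and it is fitted iteratively rather than in closed form.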
Relevance Vector Machine. Support Vector Machine. Classification and regression.
More Types of Learning.
Architecture.
Artificial Neural Network.
One of the most interesting areas.
Cognitive Computing. Machine learning, reasoning, NLP, speech, vision. Deep Learning. Neural networks with representational learning.
Auto Encoder. Efficient coding of unlabeled data. Shallow for encoder and decoder. Or compression and decompression. SOM. Unsupervised ML. Reduce dimensions. Clustering, dimension reduction, feature extraction. Need to adjust weight vectors of the neurons.
RNN. Nodes can form a cycle because of feedback loops. For temporal, sequential data. Feedback acts as memory. LSTM. RNN with gates that preserve long term dependencies and mitigate the vanishing gradient problem. For data streams. GRU. Like an LSTM with a forget gate but with fewer parameters, since it has no output gate. Audio modeling. ESN. Echo state network. Sparsely connected hidden layer whose weights are fixed. Only the output weights are trained, which avoids gradient descent through the recurrent layer. Signal processing. Reservoir Computing. Fixed nonlinear reservoir maps input to a higher dimension at lower training cost. Temporal and sequential. Includes echo state networks and liquid state machines. Bidirectional RNN. Input in both forward and backward directions. Deep RNN. Multiple hidden layers.
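The feedback loop that gives an RNN its memory fits in one line of arithmetic. This sketch uses a single scalar hidden state and hand-picked weights `w_x`, `w_h` and `b`, all of which are assumptions for illustration. Real networks use vectors and learned weight matrices.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: mix the new input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# A single pulse followed by silence: the state keeps an echo of the pulse.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The hidden state stays nonzero after the input has gone back to zero, but it fades at each step. That fading is the short memory that LSTM and GRU gates are designed to extend.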
Multilayer Perceptron. Feed forward artificial NN. GAN. Generative adversarial network. Two networks, a generator and a discriminator, compete against each other.
Convolutional Neural Network. U-Net. For visual and spatial imagery tasks. Masks and filters identify local features, which are then stacked in layers. There are many variations of CNNs, including LeNet, AlexNet, ResNet, GoogLeNet/InceptionNet, MobileNetV1, ZFNet, VGG, PolyNet, Inception v2/v3/v4 and Inception-ResNet, DenseNet, Pyramidal Net, Xception.
Transformer. Deep learning with self attention. For NLP and CV. Can look at the entire input and the relevance among inputs and outputs. Input length is limited. Focuses on different parts of the input sequence based on context.
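The core of self attention is scaled dot product attention. Each query is compared with every key, the scores go through softmax, and the result weights the values. The sketch below is a bare version with plain lists. It leaves out the learned projections, multiple heads and masking that a full Transformer layer adds.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot product attention over lists of equal length vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Identical keys get equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [4.0, 6.0]]))
```

Because every query is scored against every key, the cost grows with the square of the sequence length, which is one reason input length is limited.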
Spiking Neural Network. Mimics natural neurons. Fires when the membrane potential reaches a threshold rather than passing continuous values. Memtransistor. Electronic component. Single or dual gates. Electrochemical RAM (ECRAM). Non volatile memory. Multiple levels per cell.
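A leaky integrate and fire neuron is one common simplified model of the spiking behavior described above. The sketch below is an assumption-heavy toy: the threshold, leak factor and refractory length are arbitrary, and time advances in whole steps.

```python
def simulate_lif(currents, threshold=1.0, leak=0.9, refractory_steps=2):
    """Return a list of 0/1 spikes, one per input time step."""
    potential = 0.0
    cooldown = 0
    spikes = []
    for current in currents:
        if cooldown > 0:
            # Refractory period: input is ignored and no spike can occur.
            cooldown -= 1
            spikes.append(0)
            continue
        # The membrane potential leaks a little, then integrates the input.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)          # fire
            potential = 0.0           # reset after the spike
            cooldown = refractory_steps
        else:
            spikes.append(0)
    return spikes

print(simulate_lif([0.6] * 6))
```

The output is a train of discrete spikes in time instead of a continuous activation value.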
Data Sets. Collection of data used to train the model. Distributed Training. Split up the work load to train the model and share with multiple mini processors that work in parallel to speed up. ETL. Extract transform load to import export data from databases.
DeepDream. Computer Vision Program. Developed by Google using CNN. Restricted Boltzmann machine. Generative stochastic neural network that learns a probability distribution over its inputs.
Others.
NMT. Neural machine translation predicts entire sequences in a single integrated model. Echo State Network. RNN with a sparsely connected hidden layer. MLP. Plain fully connected feedforward ANN. Residual Network. Deep network whose layers learn residual functions through skip connections. Variational Autoencoder. Autoencoder that learns a probability distribution over its latent space. Used as a generative model. Graph Neural Network. Neural networks that operate on data represented as graphs. Pairwise message passing, with nodes exchanging information with their neighbors.
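The message passing idea behind graph neural networks can be sketched without any learned weights. In this toy version each node keeps half of its own value and takes half from the mean of its neighbors. The 50/50 mix and the three node graph are assumptions for illustration. A real GNN would apply learned transformations instead.

```python
def message_passing_step(features, neighbors):
    """One round of mean aggregation over a graph.

    features maps node -> float; neighbors maps node -> list of adjacent nodes.
    """
    updated = {}
    for node, value in features.items():
        incoming = [features[n] for n in neighbors[node]]
        neighbor_mean = sum(incoming) / len(incoming) if incoming else 0.0
        updated[node] = 0.5 * value + 0.5 * neighbor_mean
    return updated

# Path graph a - b - c, with a signal that starts only at node a.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": 1.0, "b": 0.0, "c": 0.0}

step1 = message_passing_step(features, neighbors)
step2 = message_passing_step(step1, neighbors)
print(step1)
print(step2)
```

Information travels one hop per round. Node c sees nothing after the first step and only receives part of the signal after the second.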
Types of Architectures for the Models.
Applications.
AI and Commercial Products.
AlphaFold. Predictions of protein structure. LangChain. Software development framework for LLM applications. Word2Vec. Learns word associations for NLP from a large amount of text. Seq2seq. Family of machine learning approaches for NLP that turn one sequence into another.
Midjourney. DALL-E. Stable Diffusion. Generative AI for images. BERT. Language model from Google for NLP with transformer architecture. Chinchilla AI. LLM from DeepMind after Gopher. BLOOM. BigScience Large Open-science Open-access Multilingual Language Model. Decoder only, transformer based, free LLM. LLaMA. Large Language Model Meta AI. From Meta. LaMDA. Language Model for Dialogue Applications. Conversational AI used by Bard. PaLM. Pathways Language Model. 540 billion parameter transformer LLM from Google.
Anthropic. AI company started in 2021 by former members of OpenAI, with support from Google. EleutherAI. Non profit research group building open source alternatives to GPT models. DeepMind. Research lab for Google based in UK. Hugging Face. American company with transformers library for NLP. Mila. Quebec AI Institute for Machine Learning. MIT CSAIL. Computer Science and Artificial Intelligence Laboratory for research. Very large lab. OpenAI. Research lab founded in 2015 with support from Elon Musk, and later backed by Microsoft.
Products Available: Commercial and Open Source.
Thank You !
Disclaimer. Educational and informational only. No legal or investment liability. References. Wikipedia.





